Changes in / [2fd0de0:6726a3a]


Location:
doc/theses/thierry_delisle_PhD/thesis
Files:
8 edited

  • doc/theses/thierry_delisle_PhD/thesis/fig/idle.fig

    r2fd0de0 r6726a3a  
    88-2
    991200 2
    10 6 5919 5250 6375 5775
    11 5 1 0 1 0 7 50 -1 -1 0.000 0 0 0 0 6147.000 5409.011 6102 5410 6147 5364 6192 5410
    12 5 1 0 1 0 7 50 -1 -1 0.000 0 0 0 0 6147.000 5410.000 6010 5410 6147 5273 6284 5410
    13 2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8
    14          6010 5410 6010 5501 5919 5501 5919 5775 6375 5775 6375 5501
    15          6284 5501 6284 5410
    16 2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 4
    17          6102 5410 6102 5501 6192 5501 6192 5410
    18 -6
    19 6 7442 6525 7875 6900
     105 1 0 1 0 7 50 -1 -1 0.000 0 1 1 1 3376.136 2169.318 2250 2625 2775 3225 3525 3375
     11        1 1 1.00 60.00 120.00
     12        7 1 1.00 60.00 60.00
     136 3466 2774 3899 3149
    20142 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
    21          7501 6584 7442 6900
     15         3525 2833 3466 3149
    22162 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
    23          7856 6584 7836 6703
     17         3880 2833 3860 2952
    24183 2 0 1 0 7 50 -1 -1 0.000 0 0 0 4
    25          7481 6703 7599 6663 7737 6722 7836 6703
     19         3505 2952 3623 2912 3761 2971 3860 2952
    2620         0.000 -0.500 -0.500 0.000
    27213 2 0 1 0 7 50 -1 -1 0.000 0 0 0 4
    28          7503 6579 7621 6540 7759 6599 7857 6579
     22         3527 2828 3645 2789 3783 2848 3881 2828
    2923         0.000 -0.500 -0.500 0.000
    3024-6
    31 6 7575 6825 7950 7325
     256 3599 3074 3974 3574
    32262 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8
    33          7575 6950 7700 6825 7950 6825 7950 7325 7575 7325 7575 6950
    34          7700 6950 7700 6825
     27         3599 3199 3724 3074 3974 3074 3974 3574 3599 3574 3599 3199
     28         3724 3199 3724 3074
    3529-6
    36 6 9092 6525 9525 6900
     306 5116 2774 5549 3149
    37312 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
    38          9151 6584 9092 6900
     32         5175 2833 5116 3149
    39332 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
    40          9506 6584 9486 6703
     34         5530 2833 5510 2952
    41353 2 0 1 0 7 50 -1 -1 0.000 0 0 0 4
    42          9131 6703 9249 6663 9387 6722 9486 6703
     36         5155 2952 5273 2912 5411 2971 5510 2952
    4337         0.000 -0.500 -0.500 0.000
    44383 2 0 1 0 7 50 -1 -1 0.000 0 0 0 4
    45          9153 6579 9271 6540 9409 6599 9507 6579
     39         5177 2828 5295 2789 5433 2848 5531 2828
    4640         0.000 -0.500 -0.500 0.000
    4741-6
    48 6 9225 6825 9600 7325
     426 5249 3074 5625 3574
    49432 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8
    50          9225 6950 9350 6825 9600 6825 9600 7325 9225 7325 9225 6950
    51          9350 6950 9350 6825
     44         5249 3199 5374 3074 5625 3074 5625 3574 5249 3574 5249 3199
     45         5374 3199 5374 3074
    5246-6
    53 6 10742 6525 11175 6900
     476 6766 2774 7199 3149
    54482 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
    55          10801 6584 10742 6900
     49         6825 2833 6766 3149
    56502 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
    57          11156 6584 11136 6703
     51         7180 2833 7160 2952
    58523 2 0 1 0 7 50 -1 -1 0.000 0 0 0 4
    59          10781 6703 10899 6663 11037 6722 11136 6703
     53         6805 2952 6923 2912 7061 2971 7160 2952
    6054         0.000 -0.500 -0.500 0.000
    61553 2 0 1 0 7 50 -1 -1 0.000 0 0 0 4
    62          10803 6579 10921 6540 11059 6599 11157 6579
     56         6827 2828 6945 2789 7083 2848 7181 2828
    6357         0.000 -0.500 -0.500 0.000
    6458-6
    65 6 10875 6825 11250 7325
     596 6899 3074 7274 3574
    66602 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8
    67          10875 6950 11000 6825 11250 6825 11250 7325 10875 7325 10875 6950
    68          11000 6950 11000 6825
     61         6899 3199 7024 3074 7274 3074 7274 3574 6899 3574 6899 3199
     62         7024 3199 7024 3074
     63-6
     646 1875 1500 2331 2025
     655 1 0 1 0 7 50 -1 -1 0.000 0 0 0 0 2104.000 1660.011 2058 1660 2103 1614 2148 1660
     665 1 0 1 0 7 50 -1 -1 0.000 0 0 0 0 2104.000 1661.000 1966 1660 2103 1523 2240 1660
     672 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8
     68         1966 1660 1966 1751 1875 1751 1875 2025 2331 2025 2331 1751
     69         2240 1751 2240 1660
     702 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 4
     71         2058 1660 2058 1751 2148 1751 2148 1660
    6972-6
    70732 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
    71          5850 6150 6675 6150
    72 2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
    73          5850 5250 6675 5250 6675 6600 5850 6600 5850 5250
    74 2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    75         1 1 1.00 60.00 120.00
    76         7 0 1.00 60.00 60.00
    77          7725 6150 7725 6525
    78 2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    79         1 1 1.00 60.00 120.00
    80         7 0 1.00 60.00 60.00
    81          9375 6150 9375 6525
    82 2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    83         1 1 1.00 60.00 120.00
    84         7 0 1.00 60.00 60.00
    85          11025 6150 11025 6525
    86 2 3 0 1 0 7 50 -1 -1 0.000 0 0 0 0 0 7
    87          10500 5854 10763 6308 11288 6308 11550 5854 11288 5400 10763 5400
    88          10500 5854
    89 2 3 0 1 0 7 50 -1 -1 0.000 0 0 0 0 0 7
    90          8850 5854 9113 6308 9638 6308 9900 5854 9638 5400 9113 5400
    91          8850 5854
    92 2 3 0 1 0 7 50 -1 -1 0.000 0 0 0 0 0 7
    93          7200 5854 7463 6308 7988 6308 8250 5854 7988 5400 7463 5400
    94          7200 5854
     74         1800 2400 2699 2399
    95752 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    9676        1 1 1.00 60.00 120.00
    9777        7 1 1.00 60.00 60.00
    98          6450 5925 7275 5925
     78         3749 2399 3749 2774
    99792 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    10080        1 1 1.00 60.00 120.00
    10181        7 1 1.00 60.00 60.00
    102          8025 5925 8925 5925
     82         5399 2399 5399 2774
    103832 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    10484        1 1 1.00 60.00 120.00
    10585        7 1 1.00 60.00 60.00
    106          9675 5925 10575 5925
     86         2550 2175 3299 2174
    107872 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    10888        1 1 1.00 60.00 120.00
    10989        7 1 1.00 60.00 60.00
    110          10725 5775 9825 5775
     90         4049 2174 4949 2174
    111912 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    11292        1 1 1.00 60.00 120.00
    11393        7 1 1.00 60.00 60.00
    114          9075 5775 8175 5775
    115 3 2 0 1 0 7 50 -1 -1 0.000 0 1 1 4
     94         5699 2174 6599 2174
     952 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    11696        1 1 1.00 60.00 120.00
    11797        7 1 1.00 60.00 60.00
    118          6300 6375 6375 6825 6750 7050 7350 6975
    119          0.000 -0.500 -0.500 0.000
    120 4 0 0 50 -1 0 11 0.0000 2 135 810 5925 5175 Idle List\001
    121 4 0 0 50 -1 0 11 0.0000 2 135 810 5175 5550 Idle List\001
    122 4 0 0 50 -1 0 11 0.0000 2 135 360 5325 5700 Lock\001
    123 4 0 0 50 -1 0 11 0.0000 2 135 540 5775 6900 Atomic\001
    124 4 0 0 50 -1 0 11 0.0000 2 135 630 5775 7125 Pointer\001
    125 4 0 0 50 -1 0 11 0.0000 2 165 810 7950 6675 Benaphore\001
    126 4 0 0 50 -1 0 11 0.0000 2 135 720 8025 7125 Event FD\001
    127 4 0 0 50 -1 0 11 0.0000 2 135 1260 7275 5325 Idle Processor\001
    128 4 0 0 50 -1 0 11 0.0000 2 165 810 9600 6675 Benaphore\001
    129 4 0 0 50 -1 0 11 0.0000 2 135 720 9675 7125 Event FD\001
    130 4 0 0 50 -1 0 11 0.0000 2 135 1260 8925 5325 Idle Processor\001
    131 4 0 0 50 -1 0 11 0.0000 2 165 810 11250 6675 Benaphore\001
    132 4 0 0 50 -1 0 11 0.0000 2 135 720 11325 7125 Event FD\001
    133 4 0 0 50 -1 0 11 0.0000 2 135 1260 10575 5325 Idle Processor\001
     98         6749 2024 5849 2024
     992 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
     100        1 1 1.00 60.00 120.00
     101        7 1 1.00 60.00 60.00
     102         5099 2024 4199 2024
     1032 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
     104         1800 1499 2699 1499 2699 2850 1800 2850 1800 1499
     1052 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
     106         4950 1650 5850 1650 5850 2550 4950 2550 4950 1650
     1072 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
     108         3300 1650 4200 1650 4200 2550 3300 2550 3300 1650
     1092 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
     110         6600 1650 7500 1650 7500 2550 6600 2550 6600 1650
     1112 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
     112        1 1 1.00 60.00 120.00
     113        7 1 1.00 60.00 60.00
     114         7049 2399 7049 2774
     1154 0 0 50 -1 0 11 0.0000 2 120 525 1799 3149 Atomic\001
     1164 0 0 50 -1 0 11 0.0000 2 120 510 1799 3374 Pointer\001
     1174 0 0 50 -1 0 11 0.0000 2 180 765 3974 2924 Benaphore\001
     1184 0 0 50 -1 0 11 0.0000 2 120 690 4049 3374 Event FD\001
     1194 0 0 50 -1 0 11 0.0000 2 180 765 5625 2924 Benaphore\001
     1204 0 0 50 -1 0 11 0.0000 2 120 690 5699 3374 Event FD\001
     1214 0 0 50 -1 0 11 0.0000 2 180 765 7274 2924 Benaphore\001
     1224 0 0 50 -1 0 11 0.0000 2 120 690 7349 3374 Event FD\001
     1234 2 0 50 -1 0 11 0.0000 2 135 585 1725 1800 Idle List\001
     1244 2 0 50 -1 0 11 0.0000 2 135 360 1725 1950 Lock\001
     1254 1 0 50 -1 0 11 0.0000 2 135 585 2250 1425 Idle List\001
     1264 1 0 50 -1 0 11 0.0000 2 135 1020 3750 1575 Idle Processor\001
     1274 1 0 50 -1 0 11 0.0000 2 135 1020 5400 1575 Idle Processor\001
     1284 1 0 50 -1 0 11 0.0000 2 135 1020 7050 1575 Idle Processor\001
  • doc/theses/thierry_delisle_PhD/thesis/fig/idle1.fig

    r2fd0de0 r6726a3a  
    88-2
    991200 2
    10 6 5919 5250 6375 5775
    11 5 1 0 1 0 7 50 -1 -1 0.000 0 0 0 0 6147.000 5409.011 6102 5410 6147 5364 6192 5410
    12 5 1 0 1 0 7 50 -1 -1 0.000 0 0 0 0 6147.000 5410.000 6010 5410 6147 5273 6284 5410
     106 1875 1500 2331 2025
     115 1 0 1 0 7 50 -1 -1 0.000 0 0 0 0 2104.000 1660.011 2058 1660 2103 1614 2148 1660
     125 1 0 1 0 7 50 -1 -1 0.000 0 0 0 0 2104.000 1661.000 1966 1660 2103 1523 2240 1660
    13132 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8
    14          6010 5410 6010 5501 5919 5501 5919 5775 6375 5775 6375 5501
    15          6284 5501 6284 5410
     14         1966 1660 1966 1751 1875 1751 1875 2025 2331 2025 2331 1751
     15         2240 1751 2240 1660
    16162 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 4
    17          6102 5410 6102 5501 6192 5501 6192 5410
     17         2058 1660 2058 1751 2148 1751 2148 1660
    1818-6
    19 6 7575 6525 7950 7025
     196 3599 2774 3974 3274
    20202 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8
    21          7575 6650 7700 6525 7950 6525 7950 7025 7575 7025 7575 6650
    22          7700 6650 7700 6525
     21         3599 2899 3724 2774 3974 2774 3974 3274 3599 3274 3599 2899
     22         3724 2899 3724 2774
    2323-6
    24 6 9225 6525 9600 7025
     246 5249 2774 5625 3274
    25252 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8
    26          9225 6650 9350 6525 9600 6525 9600 7025 9225 7025 9225 6650
    27          9350 6650 9350 6525
     26         5249 2899 5374 2774 5625 2774 5625 3274 5249 3274 5249 2899
     27         5374 2899 5374 2774
    2828-6
    29 6 10875 6525 11250 7025
     296 6899 2774 7274 3274
    30302 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8
    31          10875 6650 11000 6525 11250 6525 11250 7025 10875 7025 10875 6650
    32          11000 6650 11000 6525
     31         6899 2899 7024 2774 7274 2774 7274 3274 6899 3274 6899 2899
     32         7024 2899 7024 2774
    3333-6
    34342 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    3535        1 1 1.00 60.00 120.00
    36         7 0 1.00 60.00 60.00
    37          7725 6150 7725 6525
    38 2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    39         1 1 1.00 60.00 120.00
    40         7 0 1.00 60.00 60.00
    41          9375 6150 9375 6525
    42 2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    43         1 1 1.00 60.00 120.00
    44         7 0 1.00 60.00 60.00
    45          11025 6150 11025 6525
    46 2 3 0 1 0 7 50 -1 -1 0.000 0 0 0 0 0 7
    47          10500 5854 10763 6308 11288 6308 11550 5854 11288 5400 10763 5400
    48          10500 5854
    49 2 3 0 1 0 7 50 -1 -1 0.000 0 0 0 0 0 7
    50          8850 5854 9113 6308 9638 6308 9900 5854 9638 5400 9113 5400
    51          8850 5854
    52 2 3 0 1 0 7 50 -1 -1 0.000 0 0 0 0 0 7
    53          7200 5854 7463 6308 7988 6308 8250 5854 7988 5400 7463 5400
    54          7200 5854
     36        7 1 1.00 60.00 60.00
     37         3749 2399 3749 2774
    55382 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    5639        1 1 1.00 60.00 120.00
    5740        7 1 1.00 60.00 60.00
    58          6450 5925 7275 5925
     41         5399 2399 5399 2774
    59422 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    6043        1 1 1.00 60.00 120.00
    6144        7 1 1.00 60.00 60.00
    62          8025 5925 8925 5925
     45         7049 2399 7049 2774
    63462 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    6447        1 1 1.00 60.00 120.00
    6548        7 1 1.00 60.00 60.00
    66          9675 5925 10575 5925
     49         2550 2175 3299 2174
    67502 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    6851        1 1 1.00 60.00 120.00
    6952        7 1 1.00 60.00 60.00
    70          10725 5775 9825 5775
     53         4049 2174 4949 2174
    71542 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    7255        1 1 1.00 60.00 120.00
    7356        7 1 1.00 60.00 60.00
    74          9075 5775 8175 5775
     57         5699 2174 6599 2174
     582 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
     59        1 1 1.00 60.00 120.00
     60        7 1 1.00 60.00 60.00
     61         6749 2024 5849 2024
     622 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
     63        1 1 1.00 60.00 120.00
     64        7 1 1.00 60.00 60.00
     65         5099 2024 4199 2024
    75662 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
    76          5850 5250 6675 5250 6675 6075 5850 6075 5850 5250
    77 4 0 0 50 -1 0 11 0.0000 2 135 810 5925 5175 Idle List\001
    78 4 0 0 50 -1 0 11 0.0000 2 135 810 5175 5550 Idle List\001
    79 4 0 0 50 -1 0 11 0.0000 2 135 360 5325 5700 Lock\001
    80 4 0 0 50 -1 0 11 0.0000 2 135 1260 7275 5325 Idle Processor\001
    81 4 0 0 50 -1 0 11 0.0000 2 135 1260 8925 5325 Idle Processor\001
    82 4 0 0 50 -1 0 11 0.0000 2 135 1260 10575 5325 Idle Processor\001
    83 4 0 0 50 -1 0 11 0.0000 2 135 720 8025 6825 Event FD\001
    84 4 0 0 50 -1 0 11 0.0000 2 135 720 9675 6825 Event FD\001
    85 4 0 0 50 -1 0 11 0.0000 2 135 720 11325 6825 Event FD\001
     67         4950 1650 5850 1650 5850 2550 4950 2550 4950 1650
     682 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
     69         3300 1650 4200 1650 4200 2550 3300 2550 3300 1650
     702 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
     71         6600 1650 7500 1650 7500 2550 6600 2550 6600 1650
     722 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
     73         1800 1499 2699 1499 2699 2400 1800 2400 1800 1499
     744 2 0 50 -1 0 11 0.0000 2 135 585 1725 1800 Idle List\001
     754 2 0 50 -1 0 11 0.0000 2 135 360 1725 1950 Lock\001
     764 1 0 50 -1 0 11 0.0000 2 135 585 2250 1425 Idle List\001
     774 1 0 50 -1 0 11 0.0000 2 135 1020 3750 1575 Idle Processor\001
     784 1 0 50 -1 0 11 0.0000 2 135 1020 5400 1575 Idle Processor\001
     794 1 0 50 -1 0 11 0.0000 2 135 1020 7050 1575 Idle Processor\001
     804 0 0 50 -1 0 11 0.0000 2 120 690 4049 3074 Event FD\001
     814 0 0 50 -1 0 11 0.0000 2 120 690 5699 3074 Event FD\001
     824 0 0 50 -1 0 11 0.0000 2 120 690 7349 3074 Event FD\001
  • doc/theses/thierry_delisle_PhD/thesis/fig/idle2.fig

    r2fd0de0 r6726a3a  
    88-2
    991200 2
    10 6 5919 5250 6375 5775
    11 5 1 0 1 0 7 50 -1 -1 0.000 0 0 0 0 6147.000 5409.011 6102 5410 6147 5364 6192 5410
    12 5 1 0 1 0 7 50 -1 -1 0.000 0 0 0 0 6147.000 5410.000 6010 5410 6147 5273 6284 5410
     105 1 0 1 0 7 50 -1 -1 0.000 0 1 1 1 3150.000 2106.250 2250 2625 2775 3075 3525 3075
     11        1 1 1.00 60.00 120.00
     12        7 1 1.00 60.00 60.00
     136 1875 1500 2331 2025
     145 1 0 1 0 7 50 -1 -1 0.000 0 0 0 0 2104.000 1660.011 2058 1660 2103 1614 2148 1660
     155 1 0 1 0 7 50 -1 -1 0.000 0 0 0 0 2104.000 1661.000 1966 1660 2103 1523 2240 1660
    13162 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8
    14          6010 5410 6010 5501 5919 5501 5919 5775 6375 5775 6375 5501
    15          6284 5501 6284 5410
     17         1966 1660 1966 1751 1875 1751 1875 2025 2331 2025 2331 1751
     18         2240 1751 2240 1660
    16192 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 4
    17          6102 5410 6102 5501 6192 5501 6192 5410
     20         2058 1660 2058 1751 2148 1751 2148 1660
    1821-6
    19 6 7575 6525 7950 7025
     226 3599 2774 3974 3274
    20232 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8
    21          7575 6650 7700 6525 7950 6525 7950 7025 7575 7025 7575 6650
    22          7700 6650 7700 6525
     24         3599 2899 3724 2774 3974 2774 3974 3274 3599 3274 3599 2899
     25         3724 2899 3724 2774
    2326-6
    24 6 9225 6525 9600 7025
     276 5249 2774 5625 3274
    25282 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8
    26          9225 6650 9350 6525 9600 6525 9600 7025 9225 7025 9225 6650
    27          9350 6650 9350 6525
     29         5249 2899 5374 2774 5625 2774 5625 3274 5249 3274 5249 2899
     30         5374 2899 5374 2774
    2831-6
    29 6 10875 6525 11250 7025
     326 6899 2774 7274 3274
    30332 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8
    31          10875 6650 11000 6525 11250 6525 11250 7025 10875 7025 10875 6650
    32          11000 6650 11000 6525
     34         6899 2899 7024 2774 7274 2774 7274 3274 6899 3274 6899 2899
     35         7024 2899 7024 2774
    3336-6
    34372 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
    35          5850 6150 6675 6150
    36 2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
    37          5850 5250 6675 5250 6675 6600 5850 6600 5850 5250
    38 2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    39         1 1 1.00 60.00 120.00
    40         7 0 1.00 60.00 60.00
    41          7725 6150 7725 6525
    42 2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    43         1 1 1.00 60.00 120.00
    44         7 0 1.00 60.00 60.00
    45          9375 6150 9375 6525
    46 2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    47         1 1 1.00 60.00 120.00
    48         7 0 1.00 60.00 60.00
    49          11025 6150 11025 6525
    50 2 3 0 1 0 7 50 -1 -1 0.000 0 0 0 0 0 7
    51          10500 5854 10763 6308 11288 6308 11550 5854 11288 5400 10763 5400
    52          10500 5854
    53 2 3 0 1 0 7 50 -1 -1 0.000 0 0 0 0 0 7
    54          8850 5854 9113 6308 9638 6308 9900 5854 9638 5400 9113 5400
    55          8850 5854
    56 2 3 0 1 0 7 50 -1 -1 0.000 0 0 0 0 0 7
    57          7200 5854 7463 6308 7988 6308 8250 5854 7988 5400 7463 5400
    58          7200 5854
     38         1800 2400 2699 2399
    59392 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    6040        1 1 1.00 60.00 120.00
    6141        7 1 1.00 60.00 60.00
    62          6450 5925 7275 5925
     42         3749 2399 3749 2774
    63432 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    6444        1 1 1.00 60.00 120.00
    6545        7 1 1.00 60.00 60.00
    66          8025 5925 8925 5925
     46         5399 2399 5399 2774
    67472 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    6848        1 1 1.00 60.00 120.00
    6949        7 1 1.00 60.00 60.00
    70          9675 5925 10575 5925
     50         7049 2399 7049 2774
    71512 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    7252        1 1 1.00 60.00 120.00
    7353        7 1 1.00 60.00 60.00
    74          10725 5775 9825 5775
     54         2550 2175 3299 2174
    75552 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    7656        1 1 1.00 60.00 120.00
    7757        7 1 1.00 60.00 60.00
    78          9075 5775 8175 5775
    79 3 2 0 1 0 7 50 -1 -1 0.000 0 1 1 4
     58         4049 2174 4949 2174
     592 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    8060        1 1 1.00 60.00 120.00
    8161        7 1 1.00 60.00 60.00
    82          6300 6375 6375 6825 6900 6975 7500 6750
    83          0.000 -0.500 -0.500 0.000
    84 4 0 0 50 -1 0 11 0.0000 2 135 810 5925 5175 Idle List\001
    85 4 0 0 50 -1 0 11 0.0000 2 135 810 5175 5550 Idle List\001
    86 4 0 0 50 -1 0 11 0.0000 2 135 360 5325 5700 Lock\001
    87 4 0 0 50 -1 0 11 0.0000 2 135 540 5775 6900 Atomic\001
    88 4 0 0 50 -1 0 11 0.0000 2 135 630 5775 7125 Pointer\001
    89 4 0 0 50 -1 0 11 0.0000 2 135 1260 7275 5325 Idle Processor\001
    90 4 0 0 50 -1 0 11 0.0000 2 135 1260 8925 5325 Idle Processor\001
    91 4 0 0 50 -1 0 11 0.0000 2 135 1260 10575 5325 Idle Processor\001
    92 4 0 0 50 -1 0 11 0.0000 2 135 720 8025 6825 Event FD\001
    93 4 0 0 50 -1 0 11 0.0000 2 135 720 9675 6825 Event FD\001
    94 4 0 0 50 -1 0 11 0.0000 2 135 720 11325 6825 Event FD\001
     62         5699 2174 6599 2174
     632 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
     64        1 1 1.00 60.00 120.00
     65        7 1 1.00 60.00 60.00
     66         6749 2024 5849 2024
     672 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
     68        1 1 1.00 60.00 120.00
     69        7 1 1.00 60.00 60.00
     70         5099 2024 4199 2024
     712 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
     72         1800 1499 2699 1499 2699 2850 1800 2850 1800 1499
     732 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
     74         4950 1650 5850 1650 5850 2550 4950 2550 4950 1650
     752 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
     76         3300 1650 4200 1650 4200 2550 3300 2550 3300 1650
     772 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
     78         6600 1650 7500 1650 7500 2550 6600 2550 6600 1650
     794 0 0 50 -1 0 11 0.0000 2 120 525 1799 3149 Atomic\001
     804 0 0 50 -1 0 11 0.0000 2 120 510 1799 3374 Pointer\001
     814 2 0 50 -1 0 11 0.0000 2 135 585 1725 1800 Idle List\001
     824 2 0 50 -1 0 11 0.0000 2 135 360 1725 1950 Lock\001
     834 1 0 50 -1 0 11 0.0000 2 135 585 2250 1425 Idle List\001
     844 1 0 50 -1 0 11 0.0000 2 135 1020 3750 1575 Idle Processor\001
     854 1 0 50 -1 0 11 0.0000 2 135 1020 5400 1575 Idle Processor\001
     864 1 0 50 -1 0 11 0.0000 2 135 1020 7050 1575 Idle Processor\001
     874 0 0 50 -1 0 11 0.0000 2 120 690 4049 3074 Event FD\001
     884 0 0 50 -1 0 11 0.0000 2 120 690 5699 3074 Event FD\001
     894 0 0 50 -1 0 11 0.0000 2 120 690 7349 3074 Event FD\001
  • doc/theses/thierry_delisle_PhD/thesis/fig/idle_state.fig

    r2fd0de0 r6726a3a  
    88-2
    991200 2
    10 1 3 0 1 0 7 50 -1 -1 0.000 1 0.0000 3900 3600 571 571 3900 3600 3375 3375
    11 1 3 0 1 0 7 50 -1 -1 0.000 1 0.0000 6300 3600 605 605 6300 3600 5775 3300
    12 1 3 0 1 0 7 50 -1 -1 0.000 1 0.0000 5100 5400 600 600 5100 5400 4500 5400
     101 3 0 1 0 7 50 -1 -1 0.000 1 0.0000 3000 3600 600 600 3000 3600 2400 3600
     111 3 0 1 0 7 50 -1 -1 0.000 1 0.0000 1800 1800 600 600 1800 1800 1200 1800
     121 3 0 1 0 7 50 -1 -1 0.000 1 0.0000 4205 1800 600 600 4205 1800 3605 1800
    13132 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2
    14         0 0 1.00 60.00 120.00
    15          4200 4125 4725 4950
     14        1 1 1.00 60.00 120.00
     15         2100 2325 2625 3150
    16162 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2
    17         0 0 1.00 60.00 120.00
    18          4500 3600 5700 3600
     17        1 1 1.00 60.00 120.00
     18         2400 1800 3600 1800
    19192 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2
    20         0 0 1.00 60.00 120.00
    21          5923 4125 5475 4875
    22 4 1 0 50 -1 0 11 0.0000 2 135 450 5100 5475 AWAKE\001
    23 4 1 0 50 -1 0 11 0.0000 2 135 450 6300 3675 SLEEP\001
    24 4 1 0 50 -1 0 11 0.0000 2 135 540 3900 3675 SEARCH\001
    25 4 0 0 50 -1 0 11 0.0000 2 135 360 5775 4650 WAKE\001
    26 4 2 0 50 -1 0 11 0.0000 2 135 540 4350 4650 CANCEL\001
    27 4 1 0 50 -1 0 11 0.0000 2 135 630 5025 3450 CONFIRM\001
     20        1 1 1.00 60.00 120.00
     21         3900 2325 3375 3150
     224 1 0 50 -1 0 11 0.0000 2 120 675 3000 3675 AWAKE\001
     234 1 0 50 -1 0 11 0.0000 2 120 525 4200 1875 SLEEP\001
     244 1 0 50 -1 0 11 0.0000 2 120 720 1800 1875 SEARCH\001
     254 2 0 50 -1 0 11 0.0000 2 120 720 2250 2850 CANCEL\001
     264 1 0 50 -1 0 11 0.0000 2 120 840 2925 1650 CONFIRM\001
     274 0 0 50 -1 0 11 0.0000 2 120 540 3750 2850 WAKE\001
  • doc/theses/thierry_delisle_PhD/thesis/local.bib

    r2fd0de0 r6726a3a  
    499499}
    500500
    501 @article{MAN:linux/cfs/balancing,
     501@misc{MAN:linux/cfs/balancing,
    502502  title={Reworking {CFS} load balancing},
    503   journal={LWN article, available at: https://lwn.net/Articles/793427/},
    504   year={2013}
     503  journal={LWN article},
     504  year={2019},
     505  howpublished = {\href{https://lwn.net/Articles/793427}{https://\-lwn.net/\-Articles/\-793427}},
    505506}
    506507
     
    539540}
    540541
    541 @online{GITHUB:go,
     542@misc{GITHUB:go,
    542543  title = {GitHub - The Go Programming Language},
    543544  author = {The Go Programming Language},
     
    561562  howpublished = {\href{http://www.erlang.se/euc/08/euc_smp.pdf}{http://\-www.erlang.se/\-euc/\-08/\-euc_smp.pdf}}
    562563}
    563 
    564 
    565564
    566565@manual{MAN:tbb/scheduler,
     
    701700  note = "[Online; accessed 12-April-2022]"
    702701}
     702
    703703@misc{wiki:binpak,
    704704  author = "{Wikipedia contributors}",
     
    754754}
    755755
     756@inproceedings{Albers12,
     757    author      = {Susanne Albers and Antonios Antoniadis},
     758    title       = {Race to Idle: New Algorithms for Speed Scaling with a Sleep State},
     759    booktitle   = {Proceedings of the 2012  Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)},
     760    doi         = {10.1137/1.9781611973099.100},
     761    URL         = {https://epubs.siam.org/doi/abs/10.1137/1.9781611973099.100},
     762    eprint      = {https://epubs.siam.org/doi/pdf/10.1137/1.9781611973099.100},
     763    year        = 2012,
     764    month       = jan,
     765    pages       = {1266-1285},
     766}
  • doc/theses/thierry_delisle_PhD/thesis/text/core.tex

    r2fd0de0 r6726a3a  
    341341
    342342\subsection{Topological Work Stealing}
     343\label{s:TopologicalWorkStealing}
    343344Therefore, the approach used in the \CFA scheduler is to have per-\proc subqueues, but have an explicit data-structure track which cache substructure each subqueue is tied to.
    344345This tracking requires some finesse because reading this data structure must lead to fewer cache misses than not having the data structure in the first place.
  • doc/theses/thierry_delisle_PhD/thesis/text/io.tex

    r2fd0de0 r6726a3a  
    250250In this design, allocation and submission form a partitioned ring buffer as shown in Figure~\ref{fig:pring}.
    251251Once added to the ring buffer, the attached \gls{proc} has a significant amount of flexibility with regards to when to perform the system call.
    252 Possible options are: when the \gls{proc} runs out of \glspl{thrd} to run, after running a given number of \glspl{thrd}, etc.
     252Possible options are: when the \gls{proc} runs out of \glspl{thrd} to run, after running a given number of \glspl{thrd}, \etc.
    253253
    254254\begin{figure}
  • doc/theses/thierry_delisle_PhD/thesis/text/practice.tex

    r2fd0de0 r6726a3a  
    11\chapter{Scheduling in practice}\label{practice}
    2 The scheduling algorithm discribed in Chapter~\ref{core} addresses scheduling in a stable state.
    3 However, it does not address problems that occur when the system changes state.
     2The scheduling algorithm described in Chapter~\ref{core} addresses scheduling in a stable state.
     3This chapter addresses problems that occur when the system state changes.
    44Indeed, the \CFA runtime supports expanding and shrinking the number of \procs, both manually and, to some extent, automatically.
    5 This entails that the scheduling algorithm must support these transitions.
    6 
    7 More precise \CFA supports adding \procs using the RAII object @processor@.
    8 These objects can be created at any time and can be destroyed at any time.
    9 They are normally created as automatic stack variables, but this is not a requirement.
    10 
    11 The consequence is that the scheduler and \io subsystems must support \procs comming in and out of existence.
     5These changes affect the scheduling algorithm, which must dynamically alter its behaviour.
     6
     7In detail, \CFA supports adding \procs using the type @processor@, in both RAII and heap coding scenarios.
     8\begin{lstlisting}
     9{
     10        processor p[4]; // 4 new kernel threads
     11        ... // execute on 4 processors
     12        processor * dp = new( processor, 6 ); // 6 new kernel threads
     13        ... // execute on 10 processors
     14        delete( dp );   // delete 6 kernel threads
     15        ... // execute on 4 processors
     16} // delete 4 kernel threads
     17\end{lstlisting}
     18Dynamically allocated processors can be deleted at any time, \ie their lifetime can exceed the block of creation.
     19The consequence is that the scheduler and \io subsystems must know when these \procs come in and out of existence and roll them into the appropriate scheduling algorithms.
    1220
    1321\section{Manual Resizing}
    1422Manual resizing is expected to be a rare operation.
    15 Programmers are mostly expected to resize clusters on startup or teardown.
    16 Therefore dynamically changing the number of \procs is an appropriate moment to allocate or free resources to match the new state.
    17 As such all internal arrays that are sized based on the number of \procs need to be @realloc@ed.
    18 This also means that any references into these arrays, pointers or indexes, may need to be fixed when shrinking\footnote{Indexes may still need fixing when shrinkingbecause some indexes are expected to refer to dense contiguous resources and there is no guarantee the resource being removed has the highest index.}.
     23Programmers normally create/delete processors on a cluster at startup/teardown.
     24Therefore, dynamically changing the number of \procs is an appropriate moment to allocate or free resources to match the new state.
     25As such, all internal scheduling arrays that are sized based on the number of \procs need to be @realloc@ed.
     26This requirement also means any references into these arrays, \eg pointers or indexes, may need to be updated if elements are moved for compaction or any other reason.
     27% \footnote{Indexes may still need fixing when shrinking because some indexes are expected to refer to dense contiguous resources and there is no guarantee the resource being removed has the highest index.}
    1928
    2029There are no performance requirements, within reason, for resizing since it is expected to be rare.
    21 However, this operation has strict correctness requirements since shrinking and idle sleep can easily lead to deadlocks.
     30However, this operation has strict correctness requirements since updating and idle sleep can easily lead to deadlocks.
    2231It should also avoid, as much as possible, any effect on performance when the number of \procs remains constant.
    2332This latter requirement prohibits naive solutions, like simply adding a global lock to the ready-queue arrays.
    2433
    2534\subsection{Read-Copy-Update}
    26 One solution is to use the Read-Copy-Update\cite{wiki:rcu} pattern.
    27 In this pattern, resizing is done by creating a copy of the internal data strucures, updating the copy with the desired changes, and then attempt an Idiana Jones Switch to replace the original witht the copy.
    28 This approach potentially has the advantage that it may not need any synchronization to do the switch.
    29 However, there is a race where \procs could still use the previous, original, data structure after the copy was switched in.
    30 This race not only requires some added memory reclamation scheme, it also requires that operations made on the stale original version be eventually moved to the copy.
    31 
    32 For linked-lists, enqueing is only somewhat problematic, \ats enqueued to the original queues need to be transferred to the new, which might not preserve ordering.
    33 Dequeing is more challenging.
    34 Dequeing from the original will not necessarily update the copy which could lead to multiple \procs dequeing the same \at.
    35 Fixing this requires more synchronization or more indirection on the queues.
    36 
    37 Another challenge is that the original must be kept until all \procs have witnessed the change.
    38 This is a straight forward memory reclamation challenge but it does mean that every operation will need \emph{some} form of synchronization.
    39 If each of these operation does need synchronization then it is possible a simpler solution achieves the same performance.
    40 Because in addition to the classic challenge of memory reclamation, transferring the original data to the copy before reclaiming it poses additional challenges.
     35One solution is to use the Read-Copy-Update pattern~\cite{wiki:rcu}.
     36In this pattern, resizing is done by creating a copy of the internal data structures (\eg see Figure~\ref{fig:base-ts2}), updating the copy with the desired changes, and then attempting an Indiana Jones Switch to replace the original with the copy.
     37This approach has the advantage that it may not need any synchronization to do the switch.
     38However, there is a race where \procs still use the original data structure after the copy is switched.
     39This race not only requires adding a memory-reclamation scheme, it also requires that operations made on the stale original version are eventually moved to the copy.
     40
     41Specifically, the original data structure must be kept until all \procs have witnessed the change.
     42This requirement is the \newterm{memory reclamation challenge} and means every operation needs \emph{some} form of synchronization.
     43If all operations need synchronization, then the overall cost of this technique is likely to be similar to an uncontended lock approach.
     44In addition to the classic challenge of memory reclamation, transferring the original data to the copy before reclaiming it poses additional challenges.
    4145Merging subqueues while having a minimal impact on fairness and locality is especially difficult.
    4246
    43 \subsection{Read-Writer Lock}
    44 A simpler approach would be to use a \newterm{Readers-Writer Lock}\cite{wiki:rwlock} where the resizing requires acquiring the lock as a writer while simply enqueing/dequeing \ats requires acquiring the lock as a reader.
     47For example, given a linked-list, having a node enqueued onto the original and new list is not necessarily a problem depending on the chosen list structure.
     48If the list supports arbitrary insertions, then inconsistencies in the tail pointer do not break the list;
     49however, ordering may not be preserved.
     50Furthermore, nodes enqueued to the original queues eventually need to be uniquely transferred to the new queues, which may further perturb ordering.
     51Dequeuing is more challenging when nodes appear on both lists because of pending reclamation: dequeuing a node from one list does not remove it from the other nor is that node in the same place on the other list.
     52This situation can lead to multiple \procs dequeuing the same \at.
     53Fixing these challenges requires more synchronization or more indirection to the queues, plus coordinated searching to ensure unique elements.
     54
     55\subsection{Readers-Writer Lock}
     56A simpler approach is to use a \newterm{Readers-Writer Lock}~\cite{wiki:rwlock}, where the resizing requires acquiring the lock as a writer while simply enqueueing/dequeuing \ats requires acquiring the lock as a reader.
    4557Using a Readers-Writer lock solves the problem of dynamically resizing and leaves the challenge of finding or building a lock with sufficiently good read-side performance.
    46 Since this is not a very complex challenge and an ad-hoc solution is perfectly acceptable, building a Readers-Writer lock was the path taken.
    47 
    48 To maximize reader scalability, the readers should not contend with eachother when attempting to acquire and release the critical sections.
    49 This effectively requires that each reader have its own piece of memory to mark as locked and unlocked.
    50 Reades then acquire the lock wait for writers to finish the critical section and then acquire their local spinlocks.
    51 Writers acquire the global lock, so writers have mutual exclusion among themselves, and then acquires each of the local reader locks.
    52 Acquiring all the local locks guarantees mutual exclusion between the readers and the writer, while the wait on the read side prevents readers from continously starving the writer.
    53 \todo{reference listings}
    54 
     58Since this approach is not a very complex challenge and an ad-hoc solution is perfectly acceptable, building a Readers-Writer lock was the path taken.
     59
     60To maximize reader scalability, readers should not contend with each other when attempting to acquire and release a critical section.
     61Achieving this goal requires each reader to have its own memory to mark as locked and unlocked.
     62The read acquire possibly waits for a writer to finish the critical section and then acquires a reader's local spinlock.
     63The write acquire acquires the global lock, guaranteeing mutual exclusion among writers, and then acquires each of the local reader locks.
     64Acquiring all the local read locks guarantees mutual exclusion among the readers and the writer, while the wait on the read side prevents readers from continuously starving the writer.
     65
     66Figure~\ref{f:SpecializedReadersWriterLock} shows the outline for this specialized readers-writer lock.
     67The lock is nonblocking, so both readers and writers spin while the lock is held.
     68\todo{finish explanation}
     69
     70\begin{figure}
    5571\begin{lstlisting}
    5672void read_lock() {
    5773        // Step 1 : make sure no writers in
    5874        while write_lock { Pause(); }
    59 
    60         // May need fence here
    61 
    6275        // Step 2 : acquire our local lock
    63         while atomic_xchg( tls.lock ) {
    64                 Pause();
    65         }
    66 }
    67 
     76        while atomic_xchg( tls.lock ) { Pause(); }
     77}
    6878void read_unlock() {
    6979        tls.lock = false;
    7080}
    71 \end{lstlisting}
    72 
    73 \begin{lstlisting}
    7481void write_lock()  {
    7582        // Step 1 : lock global lock
    76         while atomic_xchg( write_lock ) {
    77                 Pause();
    78         }
    79 
     83        while atomic_xchg( write_lock ) { Pause(); }
    8084        // Step 2 : lock per-proc locks
    8185        for t in all_tls {
    82                 while atomic_xchg( t.lock ) {
    83                         Pause();
    84                 }
     86                while atomic_xchg( t.lock ) { Pause(); }
    8587        }
    8688}
    87 
    8889void write_unlock() {
    8990        // Step 1 : release local locks
    90         for t in all_tls {
    91                 t.lock = false;
    92         }
    93 
     91        for t in all_tls { t.lock = false; }
    9492        // Step 2 : release global lock
    9593        write_lock = false;
    9694}
    9795\end{lstlisting}
     96\caption{Specialized Readers-Writer Lock}
     97\label{f:SpecializedReadersWriterLock}
     98\end{figure}
    9899
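The outline in Figure~\ref{f:SpecializedReadersWriterLock} can be sketched in C11 as follows.
This is a minimal illustration under stated assumptions, not the \CFA runtime code: the fixed bound @MAX_READERS@, the explicit reader index @id@ (standing in for thread-local storage), the 64-byte padding, and the empty @pause_hint@ are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_READERS 4 /* hypothetical fixed bound on the number of procs */

typedef struct {
    atomic_bool write_lock;               /* global flag: serializes writers */
    struct {
        _Alignas(64) atomic_bool lock;    /* padded so readers do not share a cache line */
    } reader[MAX_READERS];                /* one local flag per reader */
} rw_lock;

static void pause_hint(void) { /* stand-in for a CPU pause instruction */ }

void read_lock(rw_lock *l, unsigned id) {
    assert(id < MAX_READERS);
    /* Step 1: wait until no writer holds the global flag. */
    while (atomic_load(&l->write_lock)) pause_hint();
    /* Step 2: acquire this reader's local flag. */
    while (atomic_exchange(&l->reader[id].lock, true)) pause_hint();
}

void read_unlock(rw_lock *l, unsigned id) {
    atomic_store(&l->reader[id].lock, false);
}

void write_lock(rw_lock *l) {
    /* Step 1: acquire the global flag, excluding other writers. */
    while (atomic_exchange(&l->write_lock, true)) pause_hint();
    /* Step 2: acquire every local flag, excluding all readers. */
    for (unsigned i = 0; i < MAX_READERS; i++)
        while (atomic_exchange(&l->reader[i].lock, true)) pause_hint();
}

void write_unlock(rw_lock *l) {
    /* Step 1: release the local flags. */
    for (unsigned i = 0; i < MAX_READERS; i++)
        atomic_store(&l->reader[i].lock, false);
    /* Step 2: release the global flag. */
    atomic_store(&l->write_lock, false);
}
```

In this sketch, a reader only touches its own flag on the fast path, so readers on different \procs never write to the same memory.
A zero-initialized @rw_lock@ with static storage duration starts in the unlocked state.
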
    99100\section{Idle-Sleep}
    100 In addition to users manually changing the number of \procs, it is desireable to support ``removing'' \procs when there is not enough \ats for all the \procs to be useful.
    101 While manual resizing is expected to be rare, the number of \ats is expected to vary much more which means \procs may need to be ``removed'' for only short periods of time.
    102 Furthermore, race conditions that spuriously lead to the impression that no \ats are ready are actually common in practice.
    103 Therefore resources associated with \procs should not be freed but \procs simply put into an idle state where the \gls{kthrd} is blocked until more \ats become ready.
    104 This state is referred to as \newterm{Idle-Sleep}.
     101While manual resizing of \procs is expected to be rare, the number of \ats can vary significantly over an application's lifetime, which means there are times when there are too few or too many \procs.
     102For this work, it is the programmer's responsibility to manually create \procs, so if there are too few \procs, the application must address this issue.
     103This leaves too many \procs when there are not enough \ats for all the \procs to be useful.
     104These idle \procs cannot be removed because their lifetime is controlled by the application, and only the application knows when the number of \ats may increase or decrease.
     105While idle \procs can spin until work appears, this approach wastes processor time (which other applications could use) and energy, and generates unnecessary heat.
     106Therefore, idle \procs are put into an idle state, called \newterm{Idle-Sleep}, where the \gls{kthrd} is blocked until the scheduler deems it is needed.
    105107
    106108Idle sleep effectively encompasses several challenges.
    107 First some data structure needs to keep track of all \procs that are in idle sleep.
    108 Because of idle sleep can be spurious, this data structure has strict performance requirements in addition to the strict correctness requirements.
    109 Next, some tool must be used to block kernel threads \glspl{kthrd}, \eg @pthread_cond_wait@, pthread semaphores.
    110 The complexity here is to support \at parking and unparking, timers, \io operations and all other \CFA features with minimal complexity.
    111 Finally, idle sleep also includes a heuristic to determine the appropriate number of \procs to be in idle sleep an any given time.
    112 This third challenge is however outside the scope of this thesis because developping a general heuristic is involved enough to justify its own work.
    113 The \CFA scheduler simply follows the ``Race-to-Idle'\cit{https://doi.org/10.1137/1.9781611973099.100}' approach where a sleeping \proc is woken any time an \at becomes ready and \procs go to idle sleep anytime they run out of work.
     109First, a data structure needs to keep track of all \procs that are in idle sleep.
     110Because idle sleep is spurious, this data structure has strict performance requirements, in addition to strict correctness requirements.
     111Next, some mechanism is needed to block \glspl{kthrd}, \eg @pthread_cond_wait@ or a pthread semaphore.
     112The complexity here is to support \at parking and unparking, user-level locking, timers, \io operations, and all other \CFA features with minimal complexity.
     113Finally, the scheduler needs a heuristic to determine when to block and unblock an appropriate number of \procs.
     114However, this third challenge is outside the scope of this thesis because developing a general heuristic is complex enough to justify its own work.
     115Therefore, the \CFA scheduler simply follows the ``Race-to-Idle''~\cite{Albers12} approach where a sleeping \proc is woken any time a \at becomes ready and \procs go to idle sleep anytime they run out of work.
    114116
    115117\section{Sleeping}
    116118As usual, the corner-stone of any feature related to the kernel is the choice of system call.
    117 In terms of blocking a \gls{kthrd} until some event occurs the linux kernel has many available options:
    118 
    119 \paragraph{\lstinline{pthread_mutex}/\lstinline{pthread_cond}}
    120 The most classic option is to use some combination of @pthread_mutex@ and @pthread_cond@.
    121 These serve as straight forward mutual exclusion and synchronization tools and allow a \gls{kthrd} to wait on a @pthread_cond@ until signalled.
    122 While this approach is generally perfectly appropriate for \glspl{kthrd} waiting after eachother, \io operations do not signal @pthread_cond@s.
    123 For \io results to wake a \proc waiting on a @pthread_cond@ means that a different \glspl{kthrd} must be woken up first, and then the \proc can be signalled.
     119In terms of blocking a \gls{kthrd} until some event occurs, the Linux kernel has many available options.
     120
     121\subsection{\lstinline{pthread_mutex}/\lstinline{pthread_cond}}
     122The classic option is to use some combination of the pthread mutual exclusion and synchronization locks, allowing a safe park/unpark of a \gls{kthrd} to/from a @pthread_cond@.
     123While this approach works for \glspl{kthrd} waiting among themselves, \io operations do not provide a mechanism to signal @pthread_cond@s.
     124For \io results to wake a \proc waiting on a @pthread_cond@ means a different \gls{kthrd} must be woken up first, which then signals the \proc.
    124125
    125126\subsection{\lstinline{io_uring} and Epoll}
    126 An alternative is to flip the problem on its head and block waiting for \io, using @io_uring@ or even @epoll@.
    127 This creates the inverse situation, where \io operations directly wake sleeping \procs but waking \proc from a running \gls{kthrd} must use an indirect scheme.
    128 This generally takes the form of creating a file descriptor, \eg, a dummy file, a pipe or an event fd, and using that file descriptor when \procs need to wake eachother.
    129 This leads to additional complexity because there can be a race between these artificial \io operations and genuine \io operations.
    130 If not handled correctly, this can lead to the artificial files going out of sync.
     127An alternative is to flip the problem on its head and block waiting for \io, using @io_uring@ or @epoll@.
     128This creates the inverse situation, where \io operations directly wake sleeping \procs but waking blocked \procs must use an indirect scheme.
     129This generally takes the form of creating a file descriptor, \eg, dummy file, pipe, or event fd, and using that file descriptor when \procs need to wake each other.
     130This leads to additional complexity because there can be a race between these artificial \io and genuine \io operations.
     131If not handled correctly, this can lead to artificial files getting delayed too long behind genuine files, resulting in longer latency.
    131132
    132133\subsection{Event FDs}
    133134Another interesting approach is to use an event file descriptor\cit{eventfd}.
    134 This is a Linux feature that is a file descriptor that behaves like \io, \ie, uses @read@ and @write@, but also behaves like a semaphore.
    135 Indeed, all read and writes must use 64bits large values\footnote{On 64-bit Linux, a 32-bit Linux would use 32 bits values.}.
    136 Writes add their values to the buffer, that is arithmetic addition and not buffer append, and reads zero out the buffer and return the buffer values so far\footnote{
    137 This is without the \lstinline{EFD_SEMAPHORE} flag. This flags changes the behavior of \lstinline{read} but is not needed for this work.}.
     135This Linux feature is a file descriptor that behaves like \io, \ie, uses @read@ and @write@, but also behaves like a semaphore.
     136Indeed, all reads and writes must use 8-byte values, \ie an unsigned 64-bit integer.
     137Writes \emph{add} their values to a buffer using arithmetic addition versus buffer append, and reads zero out the buffer and return the buffer values so far.\footnote{
     138This behaviour is without the \lstinline{EFD_SEMAPHORE} flag, which changes the behaviour of \lstinline{read} but is not needed for this work.}
    138139If a read is made while the buffer is already 0, the read blocks until a non-0 value is added.
    139 What makes this feature particularly interesting is that @io_uring@ supports the @IORING_REGISTER_EVENTFD@ command, to register an event fd to a particular instance.
    140 Once that instance is registered, any \io completion will result in @io\_uring@ writing to the event FD.
    141 This means that a \proc waiting on the event FD can be \emph{directly} woken up by either other \procs or incomming \io.
     140What makes this feature particularly interesting is that @io_uring@ supports the @IORING_REGISTER_EVENTFD@ command to register an event @fd@ to a particular instance.
     141Once that instance is registered, any \io completion results in @io_uring@ writing to the event @fd@.
     142This means that a \proc waiting on the event @fd@ can be \emph{directly} woken up by either other \procs or incoming \io.
     143
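The add-on-write and drain-on-read behaviour described above can be observed with a short Linux-only sketch.
The function name @eventfd_drain_demo@ and the values 3 and 4 are illustrative; @EFD_NONBLOCK@ is used here only so that a read of an empty counter returns an error rather than blocking, which is not how an idle \proc would use it.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Writes 3 and 4 to a fresh event fd and returns what a single read yields. */
uint64_t eventfd_drain_demo(void) {
    int fd = eventfd(0, EFD_NONBLOCK);    /* counter starts at 0, no EFD_SEMAPHORE */
    assert(fd >= 0);

    uint64_t a = 3, b = 4, got = 0, again = 0;

    /* Each write arithmetically adds its 8-byte value to the counter. */
    ssize_t n = write(fd, &a, sizeof a);
    assert(n == (ssize_t)sizeof a);
    n = write(fd, &b, sizeof b);
    assert(n == (ssize_t)sizeof b);

    /* One read returns the accumulated value and resets the counter to 0. */
    n = read(fd, &got, sizeof got);
    assert(n == (ssize_t)sizeof got);

    /* The counter is now 0: a nonblocking read fails with EAGAIN,
       where a blocking read would wait for the next write. */
    n = read(fd, &again, sizeof again);
    assert(n == -1 && errno == EAGAIN);

    close(fd);
    return got;
}
```

Because writes accumulate, several notifications that arrive before the sleeper reads are collapsed into a single wake-up.
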
     144\section{Tracking Sleepers}
     145Tracking which \procs are in idle sleep requires a data structure holding all the sleeping \procs, but more importantly it requires a concurrent \emph{handshake} so that no \at is stranded on a ready-queue with no active \proc.
     146The classic challenge occurs when a \at is made ready while a \proc is going to sleep: there is a race where the new \at may not see the sleeping \proc and the sleeping \proc may not see the ready \at.
     147Since \ats can be made ready by timers, \io operations, or other events outside a cluster, this race can occur even if the \proc going to sleep is the only \proc awake.
     148As a result, improper handling of this race leads to all \procs going to sleep when there are ready \ats and the system deadlocks.
     149
     150Furthermore, the ``Race-to-Idle'' approach means that there may be contention on the data structure tracking sleepers.
     151Contention can be tolerated for \procs attempting to sleep or wake-up because these \procs are not doing useful work, and therefore, not contributing to overall performance.
     152However, notifying, checking if a \proc must be woken-up, and doing so if needed, can significantly affect overall performance and must be low cost.
     153
     154\subsection{Sleepers List}
     155Each cluster maintains a list of idle \procs, organized as a stack.
     156This ordering allows \procs at the head of the list to stay constantly active and those at the tail to stay in idle sleep for extended periods of time.
     157Because of unbalanced performance requirements, the algorithm tracking sleepers is designed to have idle \procs handle as much of the work as possible.
     158The idle \procs maintain the stack of sleepers among themselves and notifying a sleeping \proc takes as little work as possible.
     159This approach means that maintaining the list is fairly straightforward.
     160The list can simply use a single lock per cluster and only \procs that are getting in and out of the idle state contend for that lock.
     161
     162This approach also simplifies notification.
     163Indeed, \procs not only need to be notified when a new \at is readied, but also must be notified during manual resizing, so the \gls{kthrd} can be joined.
     164These requirements mean whichever entity removes idle \procs from the sleeper list must be able to do so in any order.
     165Using a simple lock over this data structure makes the removal much simpler than using a lock-free data structure.
     166The single lock also means the notification process simply needs to wake-up the desired idle \proc, using @pthread_cond_signal@, @write@ on an @fd@, \etc, and the \proc handles the rest.
     167
     168\subsection{Reducing Latency}
     169As mentioned in this section, \procs going to sleep for extremely short periods of time is likely in certain scenarios.
     170Therefore, the latency of doing a system call to read from and write to an event @fd@ can negatively affect overall performance in a notable way.
     171Hence, it is important to reduce latency and contention of the notification as much as possible.
     172Figure~\ref{fig:idle1} shows the basic idle-sleep data structure.
     173For the notifiers, this data structure can cause contention on the lock and the event @fd@ syscall can cause notable latency.
    142174
    143175\begin{figure}
     
    145177        \input{idle1.pstex_t}
    146178        \caption[Basic Idle Sleep Data Structure]{Basic Idle Sleep Data Structure \smallskip\newline Each idle \proc is put onto a doubly-linked stack protected by a lock.
    147         Each \proc has a private event FD.}
     179        Each \proc has a private event \lstinline{fd}.}
    148180        \label{fig:idle1}
    149181\end{figure}
    150182
    151 
    152 \section{Tracking Sleepers}
    153 Tracking which \procs are in idle sleep requires a data structure holding all the sleeping \procs, but more importantly it requires a concurrent \emph{handshake} so that no \at is stranded on a ready-queue with no active \proc.
    154 The classic challenge is when a \at is made ready while a \proc is going to sleep, there is a race where the new \at may not see the sleeping \proc and the sleeping \proc may not see the ready \at.
    155 Since \ats can be made ready by timers, \io operations or other events outside a clusre, this race can occur even if the \proc going to sleep is the only \proc awake.
    156 As a result, improper handling of this race can lead to all \procs going to sleep and the system deadlocking.
    157 
    158 Furthermore, the ``Race-to-Idle'' approach means that there may be contention on the data structure tracking sleepers.
    159 Contention slowing down \procs attempting to sleep or wake-up can be tolerated.
    160 These \procs are not doing useful work and therefore not contributing to overall performance.
    161 However, notifying, checking if a \proc must be woken-up and doing so if needed, can significantly affect overall performance and must be low cost.
    162 
    163 \subsection{Sleepers List}
    164 Each cluster maintains a list of idle \procs, organized as a stack.
    165 This ordering hopefully allows \proc at the tail to stay in idle sleep for extended period of times.
    166 Because of these unbalanced performance requirements, the algorithm tracking sleepers is designed to have idle \proc handle as much of the work as possible.
    167 The idle \procs maintain the of sleepers among themselves and notifying a sleeping \proc takes as little work as possible.
    168 This approach means that maintaining the list is fairly straightforward.
    169 The list can simply use a single lock per cluster and only \procs that are getting in and out of idle state will contend for that lock.
    170 
    171 This approach also simplifies notification.
    172 Indeed, \procs need to be notify when a new \at is readied, but they also must be notified during resizing, so the \gls{kthrd} can be joined.
    173 This means that whichever entity removes idle \procs from the sleeper list must be able to do so in any order.
    174 Using a simple lock over this data structure makes the removal much simpler than using a lock-free data structure.
    175 The notification process then simply needs to wake-up the desired idle \proc, using @pthread_cond_signal@, @write@ on an fd, etc., and the \proc will handle the rest.
    176 
    177 \subsection{Reducing Latency}
    178 As mentioned in this section, \procs going idle for extremely short periods of time is likely in certain common scenarios.
    179 Therefore, the latency of doing a system call to read from and writing to the event fd can actually negatively affect overall performance in a notable way.
    180 Is it important to reduce latency and contention of the notification as much as possible.
    181 Figure~\ref{fig:idle1} shoes the basic idle sleep data structure.
    182 For the notifiers, this data structure can cause contention on the lock and the event fd syscall can cause notable latency.
    183 
    184 \begin{figure}
     183Contention occurs because the idle-list lock must be held to access the idle list, \eg by \procs attempting to go to sleep, \procs waking, or notification attempts.
     184The contention from the \procs attempting to go to sleep can be mitigated slightly by using @try_acquire@, so the \procs simply busy wait again searching for \ats if the lock is held.
     185This trick cannot be used when waking \procs since the waker needs to return immediately to what it was doing.
     186Interestingly, general notification, \ie waking any idle processor versus a specific one, does not strictly require modifying the list.
     187Here, contention can be reduced notably by having notifiers avoid the lock entirely by adding a pointer to the event @fd@ of the first idle \proc, as in Figure~\ref{fig:idle2}.
     188To avoid contention among notifiers, notifiers atomically exchange it to @NULL@ so only one notifier contends on the system call.
     189\todo{Expand explanation of how a notification works.}
     190
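The lock-free claim performed by notifiers can be sketched as a single atomic exchange.
The names @idle_list@, @first_fd@ and @claim_first_idle@ are hypothetical, and the sketch deliberately omits how the pointer is republished afterwards, since that step is not described here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-cluster structure: only the atomic pointer is modelled. */
typedef struct {
    _Atomic(int *) first_fd;   /* event fd of the first idle proc, or NULL */
} idle_list;

/* Returns the claimed event fd pointer, or NULL if another notifier
   already claimed it or no proc is idle. The idle-list lock is never taken. */
int *claim_first_idle(idle_list *list) {
    assert(list != NULL);
    return atomic_exchange(&list->first_fd, (int *)NULL);
}
```

Only the notifier that receives a non-@NULL@ pointer goes on to perform the system call; every other concurrent notifier sees @NULL@ and returns immediately.
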
     191\begin{figure}[t]
    185192        \centering
    186193        \input{idle2.pstex_t}
    187         \caption[Improved Idle Sleep Data Structure]{Improved Idle Sleep Data Structure \smallskip\newline An atomic pointer is added to the list, pointing to the Event FD of the first \proc on the list.}
     194        \caption[Improved Idle-Sleep Data Structure]{Improved Idle-Sleep Data Structure \smallskip\newline An atomic pointer is added to the list pointing to the Event FD of the first \proc on the list.}
    188195        \label{fig:idle2}
    189196\end{figure}
    190197
    191 The contention is mostly due to the lock on the list needing to be held to get to the head \proc.
    192 That lock can be contended by \procs attempting to go to sleep, \procs waking or notification attempts.
    193 The contentention from the \procs attempting to go to sleep can be mitigated slightly by using @try\_acquire@ instead, so the \procs simply continue searching for \ats if the lock is held.
    194 This trick cannot be used for waking \procs since they are not in a state where they can run \ats.
    195 However, it is worth nothing that notification does not strictly require accessing the list or the head \proc.
    196 Therefore, contention can be reduced notably by having notifiers avoid the lock entirely and adding a pointer to the event fd of the first idle \proc, as in Figure~\ref{fig:idle2}.
    197 To avoid contention between the notifiers, instead of simply reading the atomic pointer, notifiers atomically exchange it to @null@ so only only notifier will contend on the system call.
     198The next optimization is to avoid the latency of the event @fd@, which can be done by adding what is effectively a benaphore\cit{benaphore} in front of the event @fd@.
     199A simple three state flag is added beside the event @fd@ to avoid unnecessary system calls, as shown in Figure~\ref{fig:idle:state}.
     200In Topological Work Stealing (see Section~\ref{s:TopologicalWorkStealing}), a \proc without \ats begins searching by setting the state flag to @SEARCH@.
     201If no \ats can be found to steal, the \proc then confirms it is going to sleep by atomically swapping the state to @SLEEP@.
     202If the previous state is still @SEARCH@, then the \proc does read the event @fd@.
     203Meanwhile, notifiers atomically exchange the state to @AWAKE@.
     204If the previous state is @SLEEP@, then the notifier must write to the event @fd@.
     205However, if the notify arrives almost immediately after the \proc marks itself sleeping (idle), then both reads and writes on the event @fd@ can be omitted, which reduces latency notably.
     206These extensions lead to the final data structure shown in Figure~\ref{fig:idle}.
     207\todo{You never talk about the Beaphore. What is its purpose and when is it used?}
    198208
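The two exchanges on the three-state flag can be sketched as follows.
The function names @confirm_sleep@ and @notify@ are hypothetical, and the sketch covers only the decision of whether a system call is needed; the actual @read@ and @write@ on the event @fd@, and any later reset of the flag, are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

enum idle_state { SEARCH, SLEEP, AWAKE };

/* Proc side: called once no work was found.
   Returns true if the proc must block by reading its event fd. */
bool confirm_sleep(atomic_int *state) {
    assert(state != NULL);
    /* If a notifier already stored AWAKE, the read is skipped. */
    return atomic_exchange(state, SLEEP) == SEARCH;
}

/* Notifier side: returns true if the notifier must write to the event fd. */
bool notify(atomic_int *state) {
    assert(state != NULL);
    /* Only a proc that already committed to SLEEP needs the system call. */
    return atomic_exchange(state, AWAKE) == SLEEP;
}
```

In this sketch, a notification that lands while the flag is still @SEARCH@ costs neither side a system call: @notify@ returns false and the subsequent @confirm_sleep@ also returns false.
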
    199209\begin{figure}
    200210        \centering
    201211        \input{idle_state.pstex_t}
    202         \caption[Improved Idle Sleep Data Structure]{Improved Idle Sleep Data Structure \smallskip\newline An atomic pointer is added to the list, pointing to the Event FD of the first \proc on the list.}
     212        \caption[Improved Idle-Sleep Latency]{Improved Idle-Sleep Latency \smallskip\newline A three state flag is added to the event \lstinline{fd}.}
    203213        \label{fig:idle:state}
    204214\end{figure}
    205 
    206 The next optimization that can be done is to avoid the latency of the event fd when possible.
    207 This can be done by adding what is effectively a benaphore\cit{benaphore} in front of the event fd.
    208 A simple three state flag is added beside the event fd to avoid unnecessary system calls, as shown in Figure~\ref{fig:idle:state}.
    209 The flag starts in state @SEARCH@, while the \proc is searching for \ats to run.
    210 The \proc then confirms the sleep by atomically swaping the state to @SLEEP@.
    211 If the previous state was still @SEARCH@, then the \proc does read the event fd.
    212 Meanwhile, notifiers atomically exchange the state to @AWAKE@ state.
    213 if the previous state was @SLEEP@, then the notifier must write to the event fd.
    214 However, if the notify arrives almost immediately after the \proc marks itself idle, then both reads and writes on the event fd can be omitted, which reduces latency notably.
    215 This leads to the final data structure shown in Figure~\ref{fig:idle}.
    216215
    217216\begin{figure}
     
    219218        \input{idle.pstex_t}
    220219        \caption[Low-latency Idle Sleep Data Structure]{Low-latency Idle Sleep Data Structure \smallskip\newline Each idle \proc is put onto a doubly-linked stack protected by a lock.
    221         Each \proc has a private event FD with a benaphore in front of it.
    222         The list also has an atomic pointer to the event fd and benaphore of the first \proc on the list.}
     220        Each \proc has a private event \lstinline{fd} with a benaphore in front of it.
     221        The list also has an atomic pointer to the event \lstinline{fd} and benaphore of the first \proc on the list.}
    223222        \label{fig:idle}
    224223\end{figure}