Changeset 231b18f for libcfa


Timestamp:
Nov 6, 2020, 7:43:45 AM (4 years ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
ADT, arm-eh, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast-unique-expr, pthread-emulation, qualifiedEnum
Children:
54dcab1
Parents:
3959595
Message:

add documentation describing the race on the ARM processor accessing TLS storage

File:
1 edited

  • libcfa/src/concurrency/preemption.cfa

    r3959595 r231b18f  
    1010// Created On       : Mon Jun 5 14:20:42 2017
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Wed Aug 26 16:46:03 2020
    13 // Update Count     : 53
     12// Last Modified On : Fri Nov  6 07:42:13 2020
     13// Update Count     : 54
    1414//
    1515
     
    163163// Kernel Signal Tools
    164164//=============================================================================================
     165
     166// In a user-level threading system, there are a handful of thread-local variables where the following race occurs on the ARM.
     167//
     168// For each kernel thread running user-level threads, there is a flag variable to indicate if interrupts are
     169// enabled/disabled for that kernel thread. Therefore, this variable is made thread local.
     170//
     171// For example, this code fragment sets the state of the "interrupt" variable in thread-local memory.
     172//
     173// _Thread_local volatile int interrupts;
     174// int main() {
     175//     interrupts = 0; }  // disable interrupts
     176//
     177// which generates the following code on the ARM
     178//
     179// (gdb) disassemble main
     180// Dump of assembler code for function main:
     181//    0x0000000000000610 <+0>:  mrs     x1, tpidr_el0
     182//    0x0000000000000614 <+4>:  mov     w0, #0x0                        // #0
     183//    0x0000000000000618 <+8>:  add     x1, x1, #0x0, lsl #12
     184//    0x000000000000061c <+12>: add     x1, x1, #0x10
     185//    0x0000000000000620 <+16>: str     wzr, [x1]
     186//    0x0000000000000624 <+20>: ret
     187//
     188// The mrs moves a pointer from coprocessor register tpidr_el0 into register x1.  Register w0 is set to 0. The two adds
     189// increase the TLS pointer by the displacement (offset) 0x10, which is the location in the TLS of variable
     190// "interrupts".  Finally, 0 is stored into "interrupts" through the pointer in register x1 that points into the
     191// TLS. Once x1 holds the pointer to the location in the TLS for kernel thread N, the user thread can be preempted at
     192// the user level and put on the user-level ready-queue. When the preempted thread gets to the front of the
     193// user-level ready-queue, it is run on kernel thread M. It now stores 0 into "interrupts" back on kernel thread N,
     194// turning off interrupts on the wrong kernel thread.
     195//
     196// On the x86, the following code is generated for the same code fragment.
     197//
     198// (gdb) disassemble main
     199// Dump of assembler code for function main:
     200//    0x0000000000400420 <+0>:  movl   $0x0,%fs:0xfffffffffffffffc
     201//    0x000000000040042c <+12>: xor    %eax,%eax
     202//    0x000000000040042e <+14>: retq   
     203//
     204// and base-displacement addressing is used to atomically reset variable "interrupts" off of the TLS pointer in
     205// register "fs".
     206//
     207// Hence, the ARM has base-displacement addressing for the general-purpose registers, BUT not for the coprocessor
     208// registers. As a result, generating the address for the write into variable "interrupts" is no longer atomic.
     209//
     210// Note this problem does NOT occur when just using multiple kernel threads because the preemption ALWAYS restarts the
     211// thread on the same kernel thread.
     212//
     213// The obvious question is why does ARM use a coprocessor register to store the TLS pointer given that coprocessor
     214// registers are second-class registers with respect to the instruction set. One possible answer is that they did not
     215// want to dedicate one of the general registers to hold the TLS pointer and there was a free coprocessor register
     216// available.
    165217
    166218__cfaabi_dbg_debug_do( static thread_local void * last_interrupt = 0; )
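The effect of the race described in the added comment can be reproduced deterministically in plain C. The sketch below is not part of libcfa and does not use its preemption machinery. It assumes POSIX threads and stands in for the migration by handing a TLS address computed on one kernel thread (N, the caller) to a second kernel thread (M), which then performs the store. The names `stale_tls_store_demo`, `kernel_thread_M`, and `struct migration` are hypothetical, chosen only for illustration.

```c
#include <pthread.h>
#include <stddef.h>

/* Per-kernel-thread flag, as in the comment's example; 1 means enabled. */
static _Thread_local volatile int interrupts = 1;

/* Hypothetical helper type: carries the address computed on thread N. */
struct migration {
    volatile int *stale;   /* address of N's "interrupts", computed on N */
    int m_flag_after;      /* M's own "interrupts" after the store */
};

/* Runs on kernel thread M: finishes the store that was started on N. */
static void *kernel_thread_M(void *arg) {
    struct migration *mig = arg;
    *mig->stale = 0;                  /* intended for M, but lands in N's TLS */
    mig->m_flag_after = interrupts;   /* M's own flag is untouched */
    return NULL;
}

/* Called on kernel thread N. Returns 0 on success and reports both flags. */
int stale_tls_store_demo(int *n_flag_after, int *m_flag_after) {
    interrupts = 1;
    /* Corresponds to the mrs/add sequence: the address is formed on N. */
    struct migration mig = { &interrupts, -1 };
    pthread_t m;
    if (pthread_create(&m, NULL, kernel_thread_M, &mig) != 0) return -1;
    if (pthread_join(m, NULL) != 0) return -1;
    *n_flag_after = interrupts;       /* N's flag was cleared by M's store */
    *m_flag_after = mig.m_flag_after;
    return 0;
}
```

Calling `stale_tls_store_demo` from any thread reports that the caller's flag was cleared while the second thread's flag kept its initial value, which is the "wrong kernel thread" outcome the comment describes. On the ARM the same outcome arises without any explicit pointer passing, because the address lives in x1 across the preemption point.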