aboutsummaryrefslogtreecommitdiff
path: root/hw/intc/xive.c
diff options
context:
space:
mode:
authorNicholas Piggin <npiggin@gmail.com>2025-03-18 15:03:48 +1000
committerNicholas Piggin <npiggin@gmail.com>2025-03-20 14:48:17 +1000
commitfb802acdc8b162084e9e60d42aeba79097d14d2b (patch)
tree79e4a0fd1aa6bc9d9c727892c9213e5fbfe73e49 /hw/intc/xive.c
parent1dae461a913f9da88df05de6e2020d3134356f2e (diff)
downloadqemu-fb802acdc8b162084e9e60d42aeba79097d14d2b.zip
qemu-fb802acdc8b162084e9e60d42aeba79097d14d2b.tar.gz
qemu-fb802acdc8b162084e9e60d42aeba79097d14d2b.tar.bz2
ppc/spapr: Fix RTAS stopped state
This change takes the CPUPPCState 'quiesced' field added for powernv hardware CPU core controls (used to stop and start cores), and extends it to spapr to model the "RTAS stopped" state. This prevents the schedulers attempting to run stopped CPUs unexpectedly, which can cause hangs and possibly other unexpected behaviour. The detail of the problematic situation is this: A KVM spapr guest boots with all secondary CPUs defined to be in the "RTAS stopped" state. In this state, the CPU is only responsive to the start-cpu RTAS call. This behaviour is modeled in QEMU with the start_powered_off feature, which sets ->halted on secondary CPUs at boot. ->halted=true looks like an idle / sleep / power-save state which typically is responsive to asynchronous interrupts, but spapr clears wake-on-interrupt bits in the LPCR SPR. This more-or-less works. Commit e8291ec16da8 ("target/ppc: fix timebase register reset state") recently caused the decrementer to expire sooner at boot, causing a decrementer exception on secondary CPUs in RTAS stopped state. This was not a problem on TCG, but KVM limits how a guest can modify LPCR, in particular it prevents the clearing of wake-on-interrupt bits, and so in the course of CPU register synchronisation, the LPCR as set by spapr to model the RTAS stopped state is overwritten with KVM's LPCR value, and that then causes QEMU's interrupt code to notice the expired decrementer exception, turn that into an interrupt, and set CPU_INTERRUPT_HARD. That causes the CPU to be kicked, and the KVM vCPU thread to loop calling kvm_cpu_exec(). kvm_cpu_exec() calls kvm_arch_process_async_events(), which on ppc just returns ->halted. This is still true, so it returns immediately with EXCP_HLT, and the vCPU never goes to sleep because qemu_wait_io_event() sees CPU_INTERRUPT_HARD is set. All this while the vCPU holds the bql. This causes the boot CPU to eventually lock up when it needs the bql. So make 'quiesced' represent the "RTAS stopped" state, and have it explicitly not respond to exceptions (interrupt conditions) rather than rely on machine register state to model that state. This matches the powernv quiesced state very well because it essentially turns off the CPU core via a side-band control unit. There are still issues with QEMU and KVM idea of LPCR diverging and that is quite ugly and fragile that should be fixed. spapr should synchronize its LPCR properly with KVM, and not try to use values that KVM does not support. Reported-by: Misbah Anjum N <misanjum@linux.ibm.com> Tested-by: Misbah Anjum N <misanjum@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Diffstat (limited to 'hw/intc/xive.c')
0 files changed, 0 insertions, 0 deletions