Age | Commit message (Collapse) | Author | Files | Lines |
|
QEMU tracks whether a vcpu is halted using CPUState::halted. E.g.,
after initialization or reset, halted is 0 for the BSP (vcpu 0)
and 1 for the APs (vcpu 1, 2, ...). A halted vcpu should not be
handed to the hypervisor to run (e.g. hax_vcpu_run()).
Under HAXM, Android Emulator sometimes boots into a "vcpu shutdown
request" error while executing in SeaBIOS, with the HAXM driver
logging a guest triple fault in vcpu 1, 2, ... at RIP 0x3. That is
ultimately because the HAX accelerator asks HAXM to run those APs
when they are still in the halted state.
Normally, the vcpu thread for an AP will start by looping in
qemu_wait_io_event(), until the BSP kicks it via a pair of IPIs
(INIT followed by SIPI). But because the HAX accelerator does not
honor cpu->halted, it allows the AP vcpu thread to proceed to
hax_vcpu_run() as soon as it receives any kick, even if the kick
does not come from the BSP. It turns out that emulator has a
worker thread which periodically kicks every vcpu thread (possibly
to collect CPU usage data), and if one of these kicks comes before
those by the BSP, the AP will start execution from the wrong RIP,
resulting in the aforementioned SMP boot failure.
The solution is inspired by the KVM accelerator (credit to
Chuanxiao Dong <chuanxiao.dong@intel.com> for the pointer):
1. Get rid of questionable logic that unconditionally resets
cpu->halted before hax_vcpu_run(). Instead, only reset it at the
right moments (there are only a few "unhalt" events).
2. Add a check for cpu->halted before hax_vcpu_run().
Note that although the non-Unrestricted Guest (!ug_platform) code
path also forcibly resets cpu->halted, it is left untouched,
because only the UG code path supports SMP guests.
The patch is first merged to android emulator with Change-Id:
I9c5752cc737fd305d7eace1768ea12a07309d716
Cc: Yu Ning <yu.ning@intel.com>
Cc: Chuanxiao Dong <chuanxiao.dong@intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Message-Id: <20190610021939.13669-1-colin.xu@intel.com>
|
|
No header includes qemu-common.h after this commit, as prescribed by
qemu-common.h's file comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-5-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
net/tap-bsd.c fixed up]
|
|
Other accelerators have their own headers: sysemu/hax.h, sysemu/hvf.h,
sysemu/kvm.h, sysemu/whpx.h. Only tcg_enabled() & friends sit in
qemu-common.h. This necessitates inclusion of qemu-common.h into
headers, which is against the rules spelled out in qemu-common.h's
file comment.
Move tcg_enabled() & friends into their own header sysemu/tcg.h, and
adjust #include directives.
Cc: Richard Henderson <rth@twiddle.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-2-armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[Rebased with conflicts resolved automatically, except for
accel/tcg/tcg-all.c]
|
|
Amusingly, we had already ignored the comment to keep this value
at the end of CPUState. This restores the minimum negative offset
from TCG_AREG0 for code generation.
For the couple of uses within qom/cpu.c, without NEED_CPU_H, add
a pointer from the CPUState object to the IcountDecr object within
CPUNegativeOffsetState.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
When the -seed option is given, call qemu_guest_random_seed_main,
putting the subsystem into deterministic mode. Pass derived seeds
to each cpu created; which is a no-op unless the subsystem is in
deterministic mode.
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
CPUClass method dump_statistics() takes an fprintf()-like callback and
a FILE * to pass to it. Most callers pass fprintf() and stderr.
log_cpu_state() passes fprintf() and qemu_log_file.
hmp_info_registers() passes monitor_fprintf() and the current monitor
cast to FILE *. monitor_fprintf() casts it right back, and is
otherwise identical to monitor_printf().
The callback gets passed around a lot, which is tiresome. The
type-punning around monitor_fprintf() is ugly.
Drop the callback, and call qemu_fprintf() instead. Also gets rid of
the type-punning, since qemu_fprintf() takes NULL instead of the
current monitor cast to FILE *.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190417191805.28198-15-armbru@redhat.com>
|
|
The various TARGET_cpu_list() take an fprintf()-like callback and a
FILE * to pass to it. Their callers (vl.c's main() via list_cpus(),
bsd-user/main.c's main(), linux-user/main.c's main()) all pass
fprintf() and stdout. Thus, the flexibility provided by the (rather
tiresome) indirection isn't actually used.
Drop the callback, and call qemu_printf() instead.
Calling printf() would also work, but would make the code unsuitable
for monitor context without making it simpler.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190417191805.28198-10-armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
|
|
dump_drift_info() takes an fprintf()-like callback and a FILE * to pass
to it.
Its only caller hmp_info_jit() passes monitor_fprintf() and a Monitor
* cast to FILE *. monitor_fprintf() casts it right back, and is
otherwise identical to monitor_printf(). The type-punning is ugly.
Drop the callback, and call qemu_printf() instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190417191805.28198-6-armbru@redhat.com>
|
|
This enables CPU unplug under qtest.
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218092202.26683-2-david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
We can have a race condition between qemu_cpu_kick_thread() and
qemu_kvm_cpu_thread_fn() when we hotunplug a CPU. In this case,
qemu_cpu_kick_thread() can try to kick a thread that is exiting.
pthread_kill() returns an error and qemu is stopped by an exit(1).
qemu:qemu_cpu_kick_thread: No such process
We can ignore safely this error.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
We use cpu_stop_current() to ensure the current CPU has stopped
from places like qemu_system_reset_request(). Unfortunately its
current implementation has a race. It calls qemu_cpu_stop(),
which sets cpu->stopped to true even though the CPU hasn't
actually stopped yet. The main thread will look at the flags
set by qemu_system_reset_request() and call pause_all_vcpus().
pause_all_vcpus() waits for every cpu to have cpu->stopped true,
so it can continue (and we will start the system reset operation)
before the vcpu thread has got back to its top level loop.
Instead, just set cpu->stop and call cpu_exit(). This will
cause the vcpu to exit back to the top level loop, and there
(as part of the wait_io_event code) it will call qemu_cpu_stop().
This fixes bugs where the reset request appeared to be ignored
or the CPU misbehaved because the reset operation started
to change vcpu state while the vcpu thread was still using it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Tested-by: Jaap Crezee <jaap@jcz.nl>
Message-id: 20181207155911.12710-1-peter.maydell@linaro.org
|
|
This avoids the following I/O thread deadlock:
1) the I/O thread calls run_on_cpu for CPU 3 from a timer. single_tcg_halt_cond
is signaled
2) CPU 1 is running and exits. It finds no work item and enters CPU 2
3) because the I/O thread is stuck in run_on_cpu, the round-robin kick
timer never triggers, and CPU 3 never runs the work item
4) run_on_cpu never completes
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When all cpus are sleeping (e.g in WFI), to avoid a deadlock
in the main_loop, wake it up in order to start the warp timer.
Signed-off-by: Clement Deschamps <clement.deschamps@greensocs.com>
Message-Id: <20181021142103.19014-1-clement.deschamps@greensocs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
into staging
Error reporting patches for 2018-10-22
# gpg: Signature made Mon 22 Oct 2018 13:20:23 BST
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-error-2018-10-22: (40 commits)
error: Drop bogus "use error_setg() instead" admonitions
vpc: Fail open on bad header checksum
block: Clean up bdrv_img_create()'s error reporting
vl: Simplify call of parse_name()
vl: Fix exit status for -drive format=help
blockdev: Convert drive_new() to Error
vl: Assert drive_new() does not fail in default_drive()
fsdev: Clean up error reporting in qemu_fsdev_add()
spice: Clean up error reporting in add_channel()
tpm: Clean up error reporting in tpm_init_tpmdev()
numa: Clean up error reporting in parse_numa()
vnc: Clean up error reporting in vnc_init_func()
ui: Convert vnc_display_init(), init_keyboard_layout() to Error
ui/keymaps: Fix handling of erroneous include files
vl: Clean up error reporting in device_init_func()
vl: Clean up error reporting in parse_fw_cfg()
vl: Clean up error reporting in mon_init_func()
vl: Clean up error reporting in machine_set_property()
vl: Clean up error reporting in chardev_init_func()
qom: Clean up error reporting in user_creatable_add_opts_foreach()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
* RTC fixes (Artem)
* icount fixes (Artem)
* rr fixes (Pavel, myself)
* hotplug cleanup (Igor)
* SCSI fixes (myself)
* 4.20-rc1 KVM header update (myself)
* coalesced PIO support (Peng Hao)
* HVF fixes (Roman B.)
* Hyper-V refactoring (Roman K.)
* Support for Hyper-V IPI (Vitaly)
# gpg: Signature made Fri 19 Oct 2018 12:47:58 BST
# gpg: using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (47 commits)
replay: pass raw icount value to replay_save_clock
target/i386: kvm: just return after migrate_add_blocker failed
hyperv_testdev: add SynIC message and event testmodes
hyperv: process POST_MESSAGE hypercall
hyperv: add support for KVM_HYPERV_EVENTFD
hyperv: process SIGNAL_EVENT hypercall
hyperv: add synic event flag signaling
hyperv: add synic message delivery
hyperv: make overlay pages for SynIC
hyperv: only add SynIC in compatible configurations
hyperv: qom-ify SynIC
hyperv:synic: split capability testing and setting
i386: add hyperv-stub for CONFIG_HYPERV=n
default-configs: collect CONFIG_HYPERV* in hyperv.mak
hyperv: factor out arch-independent API into hw/hyperv
hyperv: make hyperv_vp_index inline
hyperv: split hyperv-proto.h into x86 and arch-independent parts
hyperv: rename kvm_hv_sint_route_set_sint
hyperv: make HvSintRoute reference-counted
hyperv: address HvSintRoute by X86CPU pointer
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Calling error_report() in a function that takes an Error ** argument
is suspicious. Convert a few that are actually warnings to
warn_report().
While there, split a warning consisting of multiple sentences to
conform to conventions spelled out in warn_report()'s contract.
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Cc: Wei Huang <wei@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20181017082702.5581-5-armbru@redhat.com>
|
|
This avoids lock recursion when REPLAY_CLOCK is called inside the
timers spinlock.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When we implemented per-vCPU TCG contexts, we forgot to also
distribute the tcg_time counter, which has remained as a global
accessed without any serialization, leading to potentially missed
counts.
Fix it by distributing the field over the TCG contexts, embedding
it into TCGProfile with a field called "cpu_exec_time", which is more
descriptive than "tcg_time". Add a function to query this value
directly, and for completeness, fill in the field in
tcg_profile_snapshot, even though its callers do not use it.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20181010144853.13005-5-cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
This is an alternative fix to Marc-André's original patch.
Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20180927171724.30128-1-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In record/replay icount mode vCPU thread and iothread synchronize
the execution using the checkpoints.
vCPU thread processes the virtual timers and iothread processes all others.
When iothread wants to wake up sleeping vCPU thread, it sends dummy queued
work. Therefore it could be the following sequence of the events in
record mode:
- IO: sending dummy work
- IO: processing timers
- CPU: wakeup
- CPU: clearing dummy work
- CPU: processing virtual timers
But due to the races in replay mode the sequence may change:
- IO: sending dummy work
- CPU: wakeup
- CPU: clearing dummy work
- CPU: sleeping again because nothing to do
- IO: Processing timers
- CPU: zzzz
In this case vCPU will not wake up, because dummy work is not to be set up
again.
This patch tries to wake up the vCPU when it sleeps and the icount warp
checkpoint isn't met. It means that vCPU has something to do, because
there are no other reasons of non-matching warp checkpoint.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
--
v5: improve checking that vCPU is still sleeping
Message-Id: <20180912081945.3228.19776.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20180910232752.31565-11-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20180910232752.31565-10-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Even though writes of qemu_icount can safely race with reads in
qemu_icount_raw, qemu_icount is also read by icount_adjust, which
runs in the I/O thread. Therefore, writes do needs protection of
the vm_clock_lock; for simplicity the patch protects it with both
seqlock+spinlock, which we already do for hosts that lack 64-bit atomics.
The bug actually predated the introduction of vm_clock_lock;
cpu_update_icount would have needed the BQL before the spinlock was
introduced.
Reported-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
We forgot to initialize the spinlock introduced in 94377115b2
("cpus: protect TimerState writes with a spinlock", 2018-08-23).
Fix it.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20180903171831.15446-5-cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
into staging
ppc patch queue 2018-09-07
Here's another pull request for qemu-3.1. No real theme here, just an
assortment of various fixes. Probably the most notable thing is the
removal of the ppcemb target which has been deprecated for some time
now.
# gpg: Signature made Fri 07 Sep 2018 08:30:02 BST
# gpg: using RSA key 6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-3.1-20180907:
target-ppc: Extend HWCAP2 bits for ISA 3.0
target/ppc/kvm: set vcpu as online/offline
Fix a deadlock case in the CPU hotplug flow
spapr: Correct reference count on spapr-cpu-core
mac_newworld: implement custom FWPathProvider
uninorth: add ofw-addr property to allow correct fw path generation
mac_oldworld: implement custom FWPathProvider
grackle: set device fw_name and address for correct fw path generation
macio: add addr property to macio IDE object
macio: add macio bus to help with fw path generation
macio: move MACIOIDEState type declarations to macio.h
spapr_pci: fix potential NULL pointer dereference
spapr: fix leak of rev array
ppc: Remove deprecated ppcemb target
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
The generated qapi_event_send_FOO() take an Error ** argument. They
can't actually fail, because all they do with the argument is passing it
to functions that can't fail: the QObject output visitor, and the
@qmp_emit callback, which is either monitor_qapi_event_queue() or
event_test_emit().
Drop the argument, and pass &error_abort to the QObject output visitor
and @qmp_emit instead.
Suggested-by: Eric Blake <eblake@redhat.com>
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180815133747.25032-4-peterx@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Commit message rewritten, update to qapi-code-gen.txt corrected]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
|
|
There is no known available OS for ppc around anymore that uses page
sizes below 4k, so it does not make much sense that we keep wasting
our time on building and testing the ppcemb-softmmu target. It has
been deprecated since two releases, and nobody complained, so let's
remove this now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
Because of cpu_ticks_prev, we cannot use a seqlock. But then the conversion
is even easier. :)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In the next patch, we will need to write cpu_ticks_offset from any
thread, even outside the BQL. Currently, it is protected by the BQL
just because cpu_enable_ticks and cpu_disable_ticks happen to hold it,
but the critical sections are well delimited and it's easy to remove
the BQL dependency.
Add a spinlock that matches vm_clock_seqlock, and hold it when writing
to the TimerState. This also lets us fix cpu_update_icount when 64-bit
atomics are not available.
Fields of TiemrState are reordered to avoid padding.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Move the icount->ns computation to cpu_get_icount, and make
cpu_get_icount_locked return the raw value. This makes the
atomic_read__nocheck safe, because it now happens always inside a
seqlock and any torn reads will be retried. qemu_icount_bias and
icount_time_shift also need to be accessed with atomics. At the
same time, however, you don't need atomic_read within the writer,
because no concurrent writes are possible.
The fix to vmstate lets us keep the struct nicely packed.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Iterating over the list without using atomics is undefined behaviour,
since the list can be modified concurrently by other threads (e.g.
every time a new thread is created in user-mode).
Fix it by implementing the CPU list as an RCU QTAILQ. This requires
a little bit of extra work to traverse list in reverse order (see
previous patch), but other than that the conversion is trivial.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20180819091335.22863-12-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The BQL is acquired via qemu_mutex_lock_iothread(), which makes
the profiler assign the associated wait time (i.e. most of
BQL wait time) entirely to that function. This loses the original
call site information, which does not help diagnose BQL contention.
Fix it by tracking the callers explicitly.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Fix the --disable-tcg breakage introduced by tb_lock's removal by
relying on the fact that tcg_enabled() is set to 0 at
compile-time under --disable-tcg.
While at it, add further asserts to fix builds that enable both
--disable-tcg and --enable-debug, which were broken even before
tb_lock's removal.
Tested to build x86_64-softmmu and i386-softmmu targets.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Flat unions may now have uncovered branches, so it is possible to get rid
of empty types defined for that purpose only.
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1529311206-76847-3-git-send-email-anton.nefedov@virtuozzo.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
|
|
Commit 9b0605f9837b ("cpus: tcg: unregister thread with RCU, fix
exiting of loop on unplug") changed the exit condition of the loop in
the vCPU thread function but forgot to remove the beginning 'while (1)'
statement. The resulting code :
while (1) {
...
} while (!cpu->unplug || cpu_can_run(cpu));
is a sequence of two distinct two while() loops, the first not exiting
in case of an unplug event.
Remove the first while (1) to fix CPU unplug.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20180425131828.15604-1-clg@kaod.org>
Cc: qemu-stable@nongnu.org
Fixes: 9b0605f9837b68fd56c7fc7c96a3a1a3b983687d
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
When resume of a stopped guest immediately runs into block device
errors, the BLOCK_IO_ERROR event is sent before the RESUME event.
Reproducer:
1. Create a scratch image
$ dd if=/dev/zero of=scratch.img bs=1M count=100
Size doesn't actually matter.
2. Prepare blkdebug configuration:
$ cat >blkdebug.conf <<EOF
[inject-error]
event = "write_aio"
errno = "5"
EOF
Note that errno 5 is EIO.
3. Run a guest with an additional scratch disk, i.e. with additional
arguments
-drive if=none,id=scratch-drive,format=raw,werror=stop,file=blkdebug:blkdebug.conf:scratch.img
-device virtio-blk-pci,id=scratch,drive=scratch-drive
The blkdebug part makes all writes to the scratch drive fail with
EIO. The werror=stop pauses the guest on write errors.
4. Connect to the QMP socket e.g. like this:
$ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '
Issue QMP command 'qmp_capabilities':
QMP> { "execute": "qmp_capabilities" }
5. Boot the guest.
6. In the guest, write to the scratch disk, e.g. like this:
# dd if=/dev/zero of=/dev/vdb count=1
Do double-check the device specified with of= is actually the
scratch device!
7. Issue QMP command 'cont':
QMP> { "execute": "cont" }
After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event. Good.
After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP. Not so
good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.
The funny event order confuses libvirt: virsh -r domstate DOMAIN
--reason reports "paused (unknown)" rather than "paused (I/O error)".
The culprit is vm_prepare_start().
/* Ensure that a STOP/RESUME pair of events is emitted if a
* vmstop request was pending. The BLOCK_IO_ERROR event, for
* example, according to documentation is always followed by
* the STOP event.
*/
if (runstate_is_running()) {
qapi_event_send_stop(&error_abort);
res = -1;
} else {
replay_enable_events();
cpu_enable_ticks();
runstate_set(RUN_STATE_RUNNING);
vm_state_notify(1, RUN_STATE_RUNNING);
}
/* We are sending this now, but the CPUs will be resumed shortly later */
qapi_event_send_resume(&error_abort);
return res;
When resuming a stopped guest, we take the else branch before we get
to sending RESUME. vm_state_notify() runs virtio_vmstate_change(),
among other things. This restarts I/O, triggering the BLOCK_IO_ERROR
event.
Reshuffle vm_prepare_start() to send the RESUME event earlier.
Fixes RHBZ 1566153.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180423084518.2426-1-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add a new field @target (of type @SysEmuTarget) to the output of the
@query-cpus-fast command, which provides more information about the
emulation target than the field @arch (of type @CpuInfoArch). Make @target
the new discriminator for the @CpuInfoFast return structure. Keep @arch
for compatibility.
Cc: "Daniel P. Berrange" <berrange@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180427192852.15013-5-lersek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
|
|
* Commit ca230ff33f89 added the @arch field to @CpuInfoFast, but it failed
to set the new field in qmp_query_cpus_fast(), when TARGET_S390X was not
defined. The updated @query-cpus-fast example in "qapi-schema.json"
showed "arch":"x86" only because qmp_query_cpus_fast() calls g_malloc0()
to allocate @CpuInfoFast, and the CPU_INFO_ARCH_X86 enum constant is
generated with value 0.
All @arch values other than @s390 implied the @CpuInfoOther sub-struct
for @CpuInfoFast -- at the time of writing the patch --, thus no fields
other than @arch needed to be set when TARGET_S390X was not defined. Set
@arch now, by copying the corresponding assignments from
qmp_query_cpus().
* Commit 25fa194b7b11 added the @riscv enum constant to @CpuInfoArch (used
in both @CpuInfo and @CpuInfoFast -- the return types of the @query-cpus
and @query-cpus-fast commands, respectively), and assigned, in both
return structures, the @CpuInfoRISCV sub-structure to the new enum
value.
However, qmp_query_cpus_fast() would not populate either the @arch field
or the @CpuInfoRISCV sub-structure, when TARGET_RISCV was defined; only
qmp_query_cpus() would.
Assign @CpuInfoOther to the @riscv enum constant in @CpuInfoFast, and
populate only the @arch field in qmp_query_cpus_fast(). Getting CPU
state without interrupting KVM is an exceptional thing that only S390X
does currently. Quoting Cornelia Huck <cohuck@redhat.com>, "s390x is
exceptional in that it has state in QEMU that is actually interesting
for upper layers and can be retrieved without performance penalty". See
also
<https://www.redhat.com/archives/libvir-list/2018-February/msg00121.html>.
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Viktor VM Mihajlovski <mihajlov@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Fixes: ca230ff33f89bf7102cbfbc2328716da6750aaed
Fixes: 25fa194b7b11901561532e435beb83d046899f7a
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180427192852.15013-2-lersek@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
|
|
When we run in TCG icount mode, we calculate the number of instructions
to execute using tcg_get_icount_limit(), which ensures that we stop
execution at the next timer deadline. However there is a bug where
currently we do not recalculate that limit if the guest reprograms
a timer so that the next deadline moves closer, and so we will
continue execution until the original limit and fire the timer
later than we should.
Fix this bug in qemu_timer_notify_cb(): if we are currently running
a VCPU in icount mode, we simply need to kick it out of the main
loop and back to tcg_cpu_exec(), where it will recalculate the
icount limit. If we are not currently running a VCPU, then we
retain the existing logic for waking up a halted CPU.
Cc: qemu-stable@nongnu.org
Fixes: https://bugs.launchpad.net/qemu/+bug/1754038
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180406123838.21249-1-peter.maydell@linaro.org
|
|
Now instead of using the replay_lock to guard the output of the log we
now use it to protect the whole execution section. This replaces what
the BQL used to do when it was held during TCG execution.
We also introduce some rules for locking order - mainly that you
cannot take the replay_mutex while holding the BQL. This leads to some
slight sophistry during start-up and extending the
replay_mutex_destroy function to unlock the mutex without checking
for the BQL condition so it can be cleanly dropped in the non-replay
case.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Tested-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20180227095248.1060.40374.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
|
|
staging
# gpg: Signature made Fri 09 Mar 2018 13:19:02 GMT
# gpg: using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request:
vl: introduce vm_shutdown()
virtio-scsi: fix race between .ioeventfd_stop() and vq handler
virtio-blk: fix race between .ioeventfd_stop() and vq handler
block: add aio_wait_bh_oneshot()
virtio-blk: dataplane: Don't batch notifications if EVENT_IDX is present
README: Fix typo 'git-publish'
block: Fix qemu crash when using scsi-block
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Commit 00d09fdbbae5f7864ce754913efc84c12fdf9f1a ("vl: pause vcpus before
stopping iothreads") and commit dce8921b2baaf95974af8176406881872067adfa
("iothread: Stop threads before main() quits") tried to work around the
fact that emulation was still active during termination by stopping
iothreads. They suffer from race conditions:
1. virtio_scsi_handle_cmd_vq() racing with iothread_stop_all() hits the
virtio_scsi_ctx_check() assertion failure because the BDS AioContext
has been modified by iothread_stop_all().
2. Guest vq kick racing with main loop termination leaves a readable
ioeventfd that is handled by the next aio_poll() when external
clients are enabled again, resulting in unwanted emulation activity.
This patch obsoletes those commits by fully disabling emulation activity
when vcpus are stopped.
Use the new vm_shutdown() function instead of pause_all_vcpus() so that
vm change state handlers are invoked too. Virtio devices will now stop
their ioeventfds, preventing further emulation activity after vm_stop().
Note that vm_stop(RUN_STATE_SHUTDOWN) cannot be used because it emits a
QMP STOP event that may affect existing clients.
It is no longer necessary to call replay_disable_events() directly since
vm_shutdown() does so already.
Drop iothread_stop_all() since it is no longer used.
Cc: Fam Zheng <famz@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180307144205.20619-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
This adds RISC-V into the build system enabling the following targets:
- riscv32-softmmu
- riscv64-softmmu
- riscv32-linux-user
- riscv64-linux-user
This adds defaults configs for RISC-V, enables the build for the RISC-V
CPU core, hardware, and Linux User Emulation. The 'qemu-binfmt-conf.sh'
script is updated to add the RISC-V ELF magic.
Expected checkpatch errors for consistency reasons:
ERROR: line over 90 characters
FILE: scripts/qemu-binfmt-conf.sh
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
|
|
It can never happen for single-threaded TCG that we have more than one
CPU in the list, while the first one has not been marked as "created".
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180209195239.16048-4-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
We can now also wait for the CPU creation for single-threaded TCG, so we
can move the waiting bits further out.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180209195239.16048-3-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
All but the first CPU are currently not fully inititalized (e.g.
cpu->created is never set).
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180209195239.16048-2-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The previous commit improved compile time by including less of the
generated QAPI headers. This is impossible for stuff defined directly
in qapi-schema.json, because that ends up in headers that that pull in
everything.
Move everything but include directives from qapi-schema.json to new
sub-module qapi/misc.json, then include just the "misc" shard where
possible.
It's possible everywhere, except:
* monitor.c needs qmp-command.h to get qmp_init_marshal()
* monitor.c, ui/vnc.c and the generated qapi-event-FOO.c need
qapi-event.h to get enum QAPIEvent
Perhaps we'll get rid of those some other day.
Adding a type to qapi/migration.json now recompiles some 120 instead
of 2300 out of 5100 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-25-armbru@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
|
|
In my "build everything" tree, a change to the types in
qapi-schema.json triggers a recompile of about 4800 out of 5100
objects.
The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h,
qapi-types.h. Each of these headers still includes all its shards.
Reduce compile time by including just the shards we actually need.
To illustrate the benefits: adding a type to qapi/migration.json now
recompiles some 2300 instead of 4800 objects. The next commit will
improve it further.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-24-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
|
|
The s390 CPU state can be retrieved without interrupting the
VM execution. Extendend the CpuInfoFast union with architecture
specific data and an implementation for s390.
Return data looks like this:
[
{"thread-id":64301,"props":{"core-id":0},
"arch":"s390","cpu-state":"operating",
"qom-path":"/machine/unattached/device[0]","cpu-index":0},
{"thread-id":64302,"props":{"core-id":1},
"arch":"s390","cpu-state":"operating",
"qom-path":"/machine/unattached/device[1]","cpu-index":1}
]
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1518797321-28356-4-git-send-email-mihajlov@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
The query-cpus command has an extremely serious side effect:
it always interrupts all running vCPUs so that they can run
ioctl calls. This can cause a huge performance degradation for
some workloads. And most of the information retrieved by the
ioctl calls are not even used by query-cpus.
This commit introduces a replacement for query-cpus called
query-cpus-fast, which has the following features:
o Never interrupt vCPUs threads. query-cpus-fast only returns
vCPU information maintained by QEMU itself, which should be
sufficient for most management software needs
o Drop "halted" field as it can not be retrieved in a fast
way on most architectures
o Drop irrelevant fields such as "current", "pc" and "arch"
o Rename some fields for better clarification & proper naming
standard
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Message-Id: <1518797321-28356-3-git-send-email-mihajlov@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|