aboutsummaryrefslogtreecommitdiff
path: root/softmmu
AgeCommit message (Collapse)AuthorFilesLines
2022-01-07util/oslib-posix: Forward SIGBUS to MCE handler under LinuxDavid Hildenbrand1-0/+4
Temporarily modifying the SIGBUS handler is really nasty, as we might be unlucky and receive an MCE SIGBUS while having our handler registered. Unfortunately, there is no way around messing with SIGBUS when MADV_POPULATE_WRITE is not applicable or not around. Let's forward SIGBUS that don't belong to us to the already registered handler and document the situation. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-8-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-05qemu-options: Remove the deprecated -no-quit optionThomas Huth1-7/+1
This option was just a wrapper around the -display ...,window-close=off parameter, and the name "no-quit" is rather confusing compared to "window-close" (since there are still other means to quit the emulator), so let's remove this now. Message-Id: <20211215082417.180735-1-thuth@redhat.com> Acked-by: Michal Prívozník <mprivozn@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-12-31hw/core/machine: Introduce CPU cluster topology supportYanan Wang1-0/+3
The new Cluster-Aware Scheduling support has landed in Linux 5.16, which has been proved to benefit the scheduling performance (e.g. load balance and wake_affine strategy) on both x86_64 and AArch64. So now in Linux 5.16 we have four-level arch-neutral CPU topology definition like below and a new scheduler level for clusters. struct cpu_topology { int thread_id; int core_id; int cluster_id; int package_id; int llc_id; cpumask_t thread_sibling; cpumask_t core_sibling; cpumask_t cluster_sibling; cpumask_t llc_sibling; } A cluster generally means a group of CPU cores which share L2 cache or other mid-level resources, and it is the shared resources that is used to improve scheduler's behavior. From the point of view of the size range, it's between CPU die and CPU core. For example, on some ARM64 Kunpeng servers, we have 6 clusters in each NUMA node, and 4 CPU cores in each cluster. The 4 CPU cores share a separate L2 cache and a L3 cache tag, which brings cache affinity advantage. In virtualization, on the Hosts which have pClusters (physical clusters), if we can design a vCPU topology with cluster level for guest kernel and have a dedicated vCPU pinning. A Cluster-Aware Guest kernel can also make use of the cache affinity of CPU clusters to gain similar scheduling performance. This patch adds infrastructure for CPU cluster level topology configuration and parsing, so that the user can specify cluster parameter if their machines support it. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Message-Id: <20211228092221.21068-3-wangyanan55@huawei.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> [PMD: Added '(since 7.0)' to @clusters in qapi/machine.json] Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-12-31dma: Let dma_buf_rw() propagate MemTxResultPhilippe Mathieu-Daudé1-6/+19
dma_memory_rw() returns a MemTxResult type. Do not discard it, return it to the caller. Since dma_buf_rw() was previously returning the QEMUSGList size not consumed, add an extra argument where this size can be stored. Update the 2 callers. Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211223115554.3155328-14-philmd@redhat.com>
2021-12-31dma: Let dma_buf_read() take MemTxAttrs argumentPhilippe Mathieu-Daudé1-3/+2
Let devices specify transaction attributes when calling dma_buf_read(). Keep the default MEMTXATTRS_UNSPECIFIED in the few callers. Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211223115554.3155328-13-philmd@redhat.com>
2021-12-31dma: Let dma_buf_write() take MemTxAttrs argumentPhilippe Mathieu-Daudé1-3/+2
Let devices specify transaction attributes when calling dma_buf_write(). Keep the default MEMTXATTRS_UNSPECIFIED in the few callers. Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211223115554.3155328-12-philmd@redhat.com>
2021-12-31dma: Let dma_buf_rw() take MemTxAttrs argumentPhilippe Mathieu-Daudé1-5/+6
Let devices specify transaction attributes when calling dma_buf_rw(). Keep the default MEMTXATTRS_UNSPECIFIED in the 2 callers. Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211223115554.3155328-11-philmd@redhat.com>
2021-12-30dma: Have dma_buf_read() / dma_buf_write() take a void pointerPhilippe Mathieu-Daudé1-2/+2
DMA operations are run on any kind of buffer, not arrays of uint8_t. Convert dma_buf_read/dma_buf_write functions to take a void pointer argument and save us pointless casts to uint8_t *. Remove this pointless casts in the megasas device model. Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211223115554.3155328-9-philmd@redhat.com>
2021-12-30dma: Have dma_buf_rw() take a void pointerPhilippe Mathieu-Daudé1-1/+2
DMA operations are run on any kind of buffer, not arrays of uint8_t. Convert dma_buf_rw() to take a void pointer argument to save us pointless casts to uint8_t *. Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211223115554.3155328-8-philmd@redhat.com>
2021-12-30dma: Let dma_memory_map() take MemTxAttrs argumentPhilippe Mathieu-Daudé1-1/+2
Let devices specify transaction attributes when calling dma_memory_map(). Patch created mechanically using spatch with this script: @@ expression E1, E2, E3, E4; @@ - dma_memory_map(E1, E2, E3, E4) + dma_memory_map(E1, E2, E3, E4, MEMTXATTRS_UNSPECIFIED) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20211223115554.3155328-7-philmd@redhat.com>
2021-12-30dma: Let dma_memory_rw() take MemTxAttrs argumentPhilippe Mathieu-Daudé1-1/+2
Let devices specify transaction attributes when calling dma_memory_rw(). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20211223115554.3155328-5-philmd@redhat.com>
2021-12-30dma: Let dma_memory_set() take MemTxAttrs argumentPhilippe Mathieu-Daudé1-3/+2
Let devices specify transaction attributes when calling dma_memory_set(). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20211223115554.3155328-3-philmd@redhat.com>
2021-11-29accel/tcg: suppress IRQ check for special TBsAlex Bennée1-2/+2
When we set cpu->cflags_next_tb it is because we want to carefully control the execution of the next TB. Currently there is a race that causes the second stage of watchpoint handling to get ignored if an IRQ is processed before we finish executing the instruction that triggers the watchpoint. Use the new CF_NOIRQ facility to avoid the race. We also suppress IRQs when handling precise self modifying code to avoid unnecessary bouncing. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Cc: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/245 Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211129140932.4115115-3-alex.bennee@linaro.org>
2021-11-22migration: fix dump-vmstate with modulesLaurent Vivier1-0/+1
To work correctly -dump-vmstate and vmstate-static-checker.py need to dump all the supported vmstates. But as some devices can be modules, they are not loaded at startup and not dumped. Fix that by loading all available modules before dumping the machine vmstate. Fixes: 7ab6e7fcce97 ("qdev: device module support") Cc: kraxel@redhat.com Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211116072840.132731-1-lvivier@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2021-11-15Merge tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu ↵Richard Henderson1-2/+4
into staging pci,pc,virtio: bugfixes pci power management fixes acpi hotplug fixes misc other fixes Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Mon 15 Nov 2021 05:15:09 PM CET # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] * tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu: pcie: expire pending delete pcie: fast unplug when slot power is off pcie: factor out pcie_cap_slot_unplug() pcie: add power indicator blink check pcie: implement slot power control for pcie root ports pci: implement power state vdpa: Check for existence of opts.vhostdev vdpa: Replace qemu_open_old by qemu_open at virtio: use virtio accessor to access packed event virtio: use virtio accessor to access packed descriptor flags tests: bios-tables-test update expected blobs hw/i386/acpi-build: Deny control on PCIe Native Hot-plug in _OSC bios-tables-test: Allow changes in DSDT ACPI tables hw/acpi/ich9: Add compat prop to keep HPC bit set for 6.1 machine type pcie: rename 'native-hotplug' to 'x-native-hotplug' hw/mem/pc-dimm: Restrict NUMA-specific code to NUMA machines vhost: Fix last vq queue index of devices with no cvq vhost: Rename last_index to vq_index_end softmmu/qdev-monitor: fix use-after-free in qdev_set_id() net/vhost-vdpa: fix memory leak in vhost_vdpa_get_max_queue_pairs() Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-11-15pcie: expire pending deleteGerd Hoffmann1-1/+3
Add an expire time for pending delete, once the time is over allow pressing the attention button again. This makes pcie hotplug behave more like acpi hotplug, where one can try sending an 'device_del' monitor command again in case the guest didn't respond to the first attempt. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-7-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-11softmmu/qdev-monitor: fix use-after-free in qdev_set_id()Stefan Hajnoczi1-1/+1
Reported by Coverity (CID 1465222). Fixes: 4a1d937796de0fecd8b22d7dbebf87f38e8282fd ("softmmu/qdev-monitor: add error handling in qdev_set_id") Cc: Damien Hedde <damien.hedde@greensocs.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20211102163342.31162-1-stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Damien Hedde <damien.hedde@greensocs.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
2021-11-10monitor: Fix find_device_state() for IDs containing slashesMarkus Armbruster1-7/+1
Recent commit 6952026120 "monitor: Tidy up find_device_state()" assumed the function's argument is "the device's ID or QOM path" (as documented for device_del). It's actually either an absolute QOM path, or a QOM path relative to /machine/peripheral/. Such a relative path is a device ID when it doesn't contain a slash. When it does, the function now always fails. Broke iotest 200, which uses relative path "vda/virtio-backend". It fails because object_resolve_path_component() resolves just one component, not a relative path. The obvious function to resolve relative paths is object_resolve_path(). It picks a parent automatically. Too much magic, we want to specify the parent. Create new object_resolve_path_at() for that, and use it in find_device_state(). Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20211019085711.86377-1-armbru@redhat.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-03Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingRichard Henderson2-5/+7
* Build system fixes and cleanups * DMA support in the multiboot option ROM * Rename default-bus-bypass-iommu * Deprecate -watchdog and cleanup -watchdog-action * HVF fix for <PAGE_SIZE regions * Support TSC scaling for AMD nested virtualization * Fix for ESP fuzzing bug # gpg: Signature made Tue 02 Nov 2021 10:57:37 AM EDT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] * remotes/bonzini/tags/for-upstream: (27 commits) configure: fix --audio-drv-list help message configure: Remove the check for the __thread keyword Move the l2tpv3 test from configure to meson.build meson: remove unnecessary coreaudio test program meson: remove pointless warnings meson.build: Allow to disable OSS again meson: bump submodule to 0.59.3 qtest/am53c974-test: add test for cancelling in-flight requests esp: ensure in-flight SCSI requests are always cancelled KVM: SVM: add migration support for nested TSC scaling hw/i386: fix vmmouse registration watchdog: remove select_watchdog_action vl: deprecate -watchdog watchdog: add information from -watchdog help to -device help hw/i386: Rename default_bus_bypass_iommu hvf: Avoid mapping regions < PAGE_SIZE as ram configure: do not duplicate CPU_CFLAGS into QEMU_LDFLAGS configure: remove useless NPTL probe target/i386: use DMA-enabled multiboot ROM for new-enough QEMU machine types optionrom: add a DMA-enabled multiboot ROM ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-11-02qapi: introduce x-query-ramblock QMP commandDaniel P. Berrangé1-8/+11
This is a counterpart to the HMP "info ramblock" command. It is being added with an "x-" prefix because this QMP command is intended as an adhoc debugging tool and will thus not be modelled in QAPI as fully structured data, nor will it have long term guaranteed stability. The existing HMP command is rewritten to call the QMP command. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-11-02watchdog: remove select_watchdog_actionPaolo Bonzini1-5/+5
Instead of invoking select_watchdog_action from both HMP and command line, go directly from HMP to QMP and use QemuOpts as the intermediary for the command line. This makes -watchdog-action explicitly a shortcut for "-action watchdog", so that "-watchdog-action" and "-action watchdog" override each other based on the position on the command line; previously, "-action watchdog" always won. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-02vl: deprecate -watchdogPaolo Bonzini1-0/+1
-watchdog is the same as -device except that it is case insensitive (and it allows only watchdog devices of course). Now that "-device help" can list as such the available watchdog devices, we can deprecate it. Note that even though -watchdog tries to be case insensitive, it fails at that: "-watchdog i6300xyz" fails with "Unknown -watchdog device", but "-watchdog i6300ESB" also fails (when the generated -device option is processed) with an error "'i6300ESB' is not a valid device model name". For this reason, the documentation update does not mention the case insensitivity of -watchdog. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-02watchdog: add information from -watchdog help to -device helpPaolo Bonzini1-0/+1
List all watchdog devices in a separate category, and populate their descriptions. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-02Merge remote-tracking branch ↵Richard Henderson2-11/+33
'remotes/juanquintela/tags/migration-20211031-pull-request' into staging Migration Pull request Hi this includes pending bits of migration patches. - virtio-mem support by David Hildenbrand - dirtyrate improvements by Hyman Huang - fix rdma wrid by Li Zhijian - dump-guest-memory fixes by Peter Xu Pleas apply. Thanks, Juan. # gpg: Signature made Mon 01 Nov 2021 06:03:44 PM EDT # gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723 # gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full] # gpg: aka "Juan Quintela <quintela@trasno.org>" [full] * remotes/juanquintela/tags/migration-20211031-pull-request: migration/dirtyrate: implement dirty-bitmap dirtyrate calculation memory: introduce total_dirty_pages to stat dirty pages migration/ram: Handle RAMBlocks with a RamDiscardManager on background snapshots migration/ram: Factor out populating pages readable in ram_block_populate_pages() migration: Simplify alignment and alignment checks migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination virtio-mem: Drop precopy notifier migration/ram: Handle RAMBlocks with a RamDiscardManager on the migration source virtio-mem: Implement replay_discarded RamDiscardManager callback memory: Introduce replay_discarded callback for RamDiscardManager dump-guest-memory: Block live migration migration: Add migrate_add_blocker_internal() migration: Make migration blocker work for snapshots too migration/dirtyrate: implement dirty-ring dirtyrate calculation migration/dirtyrate: move init step of calculation to main thread migration/dirtyrate: adjust order of registering thread migration/dirtyrate: introduce struct and adjust DirtyRateStat memory: make global_dirty_tracking a bitmask KVM: introduce dirty_pages and kvm_dirty_ring_enabled migration/rdma: Fix out of order wrid Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-11-01memory: Introduce replay_discarded callback for RamDiscardManagerDavid Hildenbrand1-0/+11
Introduce replay_discarded callback similar to our existing replay_populated callback, to be used my migration code to never migrate discarded memory. Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01memory: make global_dirty_tracking a bitmaskHyman Huang(黄勇)2-11/+22
since dirty ring has been introduced, there are two methods to track dirty pages of vm. it seems that "logging" has a hint on the method, so rename the global_dirty_log to global_dirty_tracking would make description more accurate. dirty rate measurement may start or stop dirty tracking during calculation. this conflict with migration because stop dirty tracking make migration leave dirty pages out then that'll be a problem. make global_dirty_tracking a bitmask can let both migration and dirty rate measurement work fine. introduce GLOBAL_DIRTY_MIGRATION and GLOBAL_DIRTY_DIRTY_RATE to distinguish what current dirty tracking aims for, migration or dirty rate. Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn> Message-Id: <9c9388657cfa0301bd2c1cfa36e7cf6da4aeca19.1624040308.git.huangy81@chinatelecom.cn> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01qdev-monitor: Check sysbus device type before creating itDamien Hedde1-0/+11
Add an early check to test if the requested sysbus device type is allowed by the current machine before creating the device. This impacts both -device cli option and device_add qmp command. Before this patch, the check was done well after the device has been created (in a machine init done notifier). We can now report the error right away. Signed-off-by: Damien Hedde <damien.hedde@greensocs.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20211029142258.484907-3-damien.hedde@greensocs.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-10-28softmmu: fix for "after access" watchpointsPavel Dovgalyuk1-1/+1
Watchpoints that should fire after the memory access break an execution of the current block, try to translate current instruction into the separate block, which then causes debug interrupt. But cpu_interrupt can't be called in such block when icount is enabled, because interrupts muse be allowed explicitly. This patch sets CF_LAST_IO flag for retranslated block, allowing interrupt request for the last instruction. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <163542169727.2127597.8141772572696627329.stgit@pasha-ThinkPad-X280> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-28softmmu: remove useless condition in watchpoint checkPavel Dovgalyuk1-21/+20
cpu_check_watchpoint function checks cpu->watchpoint_hit at the entry. But then it also does the same in the middle of the function, while this field can't change. That is why this patch removes this useless condition. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <163542169094.2127597.8801843697434113110.stgit@pasha-ThinkPad-X280> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-28softmmu: fix watchpoint processing in icount modePavel Dovgalyuk1-4/+2
Watchpoint processing code restores vCPU state twice: in tb_check_watchpoint and in cpu_loop_exit_restore/cpu_restore_state. Normally it does not affect anything, but in icount mode instruction counter is incremented twice and becomes incorrect. This patch eliminates unneeded CPU state restore. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <163542168516.2127597.8781375223437124644.stgit@pasha-ThinkPad-X280> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-23softmmu/physmem.c: Fix typo in commentGreg Kurz1-1/+1
Fix the comment to match what the code is doing, as explained in the changelog of commit 86cf9e154632cb28d749db0ea47946fba8cf3f09 that introduced the change: Commit 9458a9a1df1a4c719e24512394d548c1fc7abd22 added synchronization of vCPU and migration operations through calling run_on_cpu operation. However, in replay mode this synchronization is unneeded, because I/O and vCPU threads are already synchronized. This patch disables such synchronization for record/replay mode. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <163429018454.1146856.3429437540871060739.stgit@bahia.huguette> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-10-20device_tree: Add qemu_fdt_add_pathYanan Wang1-2/+42
qemu_fdt_add_path() works like qemu_fdt_add_subnode(), except it also adds all missing subnodes from the given path. We'll use it in a coming patch where we will add cpu-map to the device tree. And we also tweak an error message of qemu_fdt_add_subnode(). Co-developed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20211020142125.7516-3-wangyanan55@huawei.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-20qdev/qbus: remove failover specific codeLaurent Vivier1-12/+6
Commit f3a850565693 ("qdev/qbus: add hidden device support") has introduced a generic way to hide a device but it has modified qdev_device_add() to check a specific option of the failover device, "failover_pair_id", before calling the generic mechanism. It's not needed (and not generic) to do that in qdev_device_add() because this is also checked by the failover_hide_primary_device() function that uses the generic mechanism to hide the device. Cc: Jens Freimann <jfreimann@redhat.com> Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20211019071532.682717-3-lvivier@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2021-10-15vl: Enable JSON syntax for -deviceKevin Wolf1-7/+56
Like we already do for -object, introduce support for JSON syntax in -device, which can be kept stable in the long term and guarantees that a single code path with identical behaviour is used for both QMP and the command line. Compared to the QemuOpts based code, the parser contains less surprises and has support for non-scalar options (lists and structs). Switching management tools to JSON means that we can more easily change the "human" CLI syntax from QemuOpts to the keyval parser later. In the QAPI schema, a feature flag is added to the device-add command to allow management tools to detect support for this. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20211008133442.141332-16-kwolf@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-10-15qdev: Base object creation on QDict rather than QemuOptsKevin Wolf1-38/+31
QDicts are both what QMP natively uses and what the keyval parser produces. Going through QemuOpts isn't useful for either one, so switch the main device creation function to QDicts. By sharing more code with the -object/object-add code path, we can even reduce the code size a bit. This commit doesn't remove the detour through QemuOpts from any code path yet, but it allows the following commits to do so. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20211008133442.141332-15-kwolf@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-10-15qdev: Add Error parameter to hide_device() callbacksKevin Wolf1-1/+4
hide_device() is used for virtio-net failover, where the standby virtio device delays creation of the primary device. It only makes sense to have a single primary device for each standby device. Adding a second one should result in an error instead of hiding it and never using it afterwards. Prepare for this by adding an Error parameter to the hide_device() callback where virtio-net is informed about adding a primary device. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20211008133442.141332-12-kwolf@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-10-15softmmu/qdev-monitor: add error handling in qdev_set_idDamien Hedde1-10/+29
qdev_set_id() is mostly used when the user adds a device (using -device cli option or device_add qmp command). This commit adds an error parameter to handle the case where the given id is already taken. Also document the function and add a return value in order to be able to capture success/failure: the function now returns the id in case of success, or NULL in case of failure. The commit modifies the 2 calling places (qdev-monitor and xen-legacy-backend) to add the error object parameter. Note that the id is, right now, guaranteed to be unique because all ids came from the "device" QemuOptsList where the id is used as key. This addition is a preparation for a future commit which will relax the uniqueness. Signed-off-by: Damien Hedde <damien.hedde@greensocs.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20211008133442.141332-10-kwolf@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-10-15qdev: Make DeviceState.id independent of QemuOptsKevin Wolf1-2/+3
DeviceState.id is a pointer to a string that is stored in the QemuOpts object DeviceState.opts and freed together with it. We want to create devices without going through QemuOpts in the future, so make this a separately allocated string. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211008133442.141332-9-kwolf@redhat.com> Reviewed-by: Damien Hedde <damien.hedde@greensocs.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-10-15qdev: Avoid using string visitor for propertiesKevin Wolf1-3/+17
The only thing the string visitor adds compared to a keyval visitor is list support. git grep for 'visit_start_list' and 'visit.*List' shows that devices don't make use of this. In a world with a QAPIfied command line interface, the keyval visitor is used to parse the command line. In order to make sure that no devices start using this feature that would make backwards compatibility harder, just switch away from object_property_parse(), which internally uses the string visitor, to a keyval visitor and object_property_set(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20211008133442.141332-8-kwolf@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-10-13Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20211013' into stagingRichard Henderson1-10/+10
Use MO_128 for 16-byte atomic memory operations. Add cpu_ld/st_mmu memory primitives. Move helper_ld/st memory helpers out of tcg.h. Canonicalize alignment flags in MemOp. # gpg: Signature made Wed 13 Oct 2021 10:48:45 AM PDT # gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F # gpg: issuer "richard.henderson@linaro.org" # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate] * remotes/rth/tags/pull-tcg-20211013: tcg: Canonicalize alignment flags in MemOp tcg: Move helper_*_mmu decls to tcg/tcg-ldst.h target/arm: Use cpu_*_mmu instead of helper_*_mmu target/sparc: Use cpu_*_mmu instead of helper_*_mmu target/s390x: Use cpu_*_mmu instead of helper_*_mmu target/mips: Use 8-byte memory ops for msa load/store target/mips: Use cpu_*_data_ra for msa load/store accel/tcg: Move cpu_atomic decls to exec/cpu_ldst.h accel/tcg: Add cpu_{ld,st}*_mmu interfaces target/hexagon: Implement cpu_mmu_index target/s390x: Use MO_128 for 16 byte atomics target/ppc: Use MO_128 for 16 byte atomics target/i386: Use MO_128 for 16 byte atomics target/arm: Use MO_128 for 16 byte atomics memory: Log access direction for invalid accesses Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-13memory: Log access direction for invalid accessesBALATON Zoltan1-10/+10
In memory_region_access_valid() invalid accesses are logged to help debugging but the log message does not say if it was a read or write. Log that too to better identify the access causing the problem. Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu> Message-Id: <20211011173616.F1DE0756022@zero.eik.bme.hu> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-13monitor: Tidy up find_device_state()Markus Armbruster1-8/+5
Commit 6287d827d4 "monitor: allow device_del to accept QOM paths" extended find_device_state() to accept QOM paths in addition to qdev IDs. This added a checked conversion to TYPE_DEVICE at the end, which duplicates the check done for the qdev ID case earlier, except it sets a *different* error: GenericError "ID is not a hotpluggable device" when passed a QOM path, and DeviceNotFound "Device 'ID' not found" when passed a qdev ID. Fortunately, the latter won't happen as long as we add only devices to /machine/peripheral/. Earlier, commit b6cc36abb2 "qdev: device_del: Search for to be unplugged device in 'peripheral' container" rewrote the lookup by qdev ID to use QOM instead of qdev_find_recursive(), so it can handle buss-less devices. It does so by constructing an absolute QOM path. Works, but object_resolve_path_component() is easier. Switching to it also gets rid of the unclean duplication described above. While there, avoid converting to TYPE_DEVICE twice, first to check whether it's possible, and then for real. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Damien Hedde <damien.hedde@greensocs.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210916111707.84999-1-armbru@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-02softmmu/memory_mapping: optimize for RamDiscardManager sectionsDavid Hildenbrand1-0/+20
virtio-mem logically plugs/unplugs memory within a sparse memory region and notifies via the RamDiscardManager interface when parts become plugged (populated) or unplugged (discarded). Currently, we end up (via the two users) 1) zeroing all logically unplugged/discarded memory during TPM resets. 2) reading all logically unplugged/discarded memory when dumping, to figure out the content is zero. 1) is always bad, because we assume unplugged memory stays discarded (and is already implicitly zero). 2) isn't that bad with anonymous memory, we end up reading the zero page (slow and unnecessary, though). However, once we use some file-backed memory (future use case), even reading will populate memory. Let's cut out all parts marked as not-populated (discarded) via the RamDiscardManager. As virtio-mem is the single user, this now means that logically unplugged memory ranges will no longer be included in the dump, which results in smaller dump files and faster dumping. virtio-mem has a minimum granularity of 1 MiB (and the default is usually 2 MiB). Theoretically, we can see quite some fragmentation, in practice we won't have it completely fragmented in 1 MiB pieces. Still, we might end up with many physical ranges. Both, the ELF format and kdump seem to be ready to support many individual ranges (e.g., for ELF it seems to be UINT32_MAX, kdump has a linear bitmap). Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Marc-André Lureau <marcandre.lureau@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Claudio Fontana <cfontana@suse.de> Cc: Thomas Huth <thuth@redhat.com> Cc: "Alex Bennée" <alex.bennee@linaro.org> Cc: Peter Xu <peterx@redhat.com> Cc: Laurent Vivier <lvivier@redhat.com> Cc: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210727082545.17934-5-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-02softmmu/memory_mapping: factor out adding physical memory rangesDavid Hildenbrand1-21/+20
Let's factor out adding a MemoryRegionSection to the list, to be reused in RamDiscardManager context next. Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Marc-André Lureau <marcandre.lureau@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Claudio Fontana <cfontana@suse.de> Cc: Thomas Huth <thuth@redhat.com> Cc: "Alex Bennée" <alex.bennee@linaro.org> Cc: Peter Xu <peterx@redhat.com> Cc: Laurent Vivier <lvivier@redhat.com> Cc: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210727082545.17934-4-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-02softmmu/memory_mapping: never merge ranges accross memory regionsDavid Hildenbrand1-1/+2
Let's make sure to not merge when different memory regions are involved. Unlikely, but theoretically possible. Acked-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Marc-André Lureau <marcandre.lureau@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Claudio Fontana <cfontana@suse.de> Cc: Thomas Huth <thuth@redhat.com> Cc: "Alex Bennée" <alex.bennee@linaro.org> Cc: Peter Xu <peterx@redhat.com> Cc: Laurent Vivier <lvivier@redhat.com> Cc: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210727082545.17934-3-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30memory: Add tracepoint for dirty syncPeter Xu2-0/+3
Trace at memory_region_sync_dirty_bitmap() for log_sync() or global_log_sync() on memory regions. One trace line should suffice when it finishes, so as to estimate the time used for each log sync process. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210817013706.30986-1-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30memory: Name all the memory listenersPeter Xu1-0/+1
Provide a name field for all the memory listeners. It can be used to identify which memory listener is which. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20210817013553.30584-2-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30memory: Add RAM_PROTECTED flag to skip IOMMU mappingsSean Christopherson2-1/+7
Add a new RAMBlock flag to denote "protected" memory, i.e. memory that looks and acts like RAM but is inaccessible via normal mechanisms, including DMA. Use the flag to skip protected memory regions when mapping RAM for DMA in VFIO. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-13qdev: Support marking individual buses as 'full'Peter Maydell1-1/+6
By default, QEMU will allow devices to be plugged into a bus up to the bus class's device count limit. If the user creates a device on the command line or via the monitor and doesn't explicitly specify the bus to plug it in, QEMU will plug it into the first non-full bus that it finds. This is fine in most cases, but some machines have multiple buses of a given type, some of which are dedicated to on-board devices and some of which have an externally exposed connector for user-pluggable devices. One example is I2C buses. Provide a new function qbus_mark_full() so that a machine model can mark this kind of "internal only" bus as 'full' after it has created all the devices that should be plugged into that bus. The "find a non-full bus" algorithm will then skip the internal-only bus when looking for a place to plug in user-created devices. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20210903151435.22379-2-peter.maydell@linaro.org
2021-09-06softmmu/vl: Deprecate the -sdl and -curses optionThomas Huth1-0/+3
It's not that much complicated to type "-display sdl" or "-display curses", so we should not clutter our main option name space with such simple wrapper options and rather present the users with a concise interface instead. Thus let's deprecate the "-sdl" and "-curses" wrapper options now. Message-Id: <20210825092023.81396-4-thuth@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>