aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-01-07tests: acpi: Add updated TPM related tablesStefan Berger3-2/+0
The updated TPM related tables have the following additions: Device (TPM) { Name (_HID, "MSFT0101" /* TPM 2.0 Security Device */) // _HID: Hardware ID + Name (_STR, "TPM 2.0 Device") // _STR: Description String + Name (_UID, One) // _UID: Unique ID Name (_STA, 0x0F) // _STA: Status Name (_CRS, ResourceTemplate () // _CRS: Current Resource Settings Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Ani Sinha <ani@anisinha.ca> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Ani Sinha <ani@anisinha.ca> Message-id: 20211223022310.575496-4-stefanb@linux.ibm.com Message-Id: <20220104175806.872996-4-stefanb@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07acpi: tpm: Add missing device identification objectsStefan Berger2-0/+8
Add missing TPM device identification objects _STR and _UID. They will appear as files 'description' and 'uid' under Linux sysfs. Following inspection of sysfs entries for hardware TPMs we chose uid '1'. Cc: Shannon Zhao <shannon.zhaosl@gmail.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Ani Sinha <ani@anisinha.ca> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/708 Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Ani Sinha <ani@anisinha.ca> Reviewed-by: Shannon Zhao <shannon.zhaosl@gmail.com> Message-id: 20211223022310.575496-3-stefanb@linux.ibm.com Message-Id: <20220104175806.872996-3-stefanb@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2022-01-07tests: acpi: prepare for updated TPM related tablesStefan Berger1-0/+2
Replace existing TPM related tables, that are about to change, with empty files. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Ani Sinha <ani@anisinha.ca> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Ani Sinha <ani@anisinha.ca> Message-id: 20211223022310.575496-2-stefanb@linux.ibm.com Message-Id: <20220104175806.872996-2-stefanb@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Igor Mammedov <imammedo@redhat.com>
2022-01-07virtio/vhost-vsock: don't double close vhostfd, remove redundant cleanupDaniil Tatianin1-6/+5
In case of an error during initialization in vhost_dev_init, vhostfd is closed in vhost_dev_cleanup. Remove close from err_virtio as it's both redundant and causes a double close on vhostfd. Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru> Message-Id: <20211129125204.1108088-1-d-tatianin@yandex-team.ru> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07hw/scsi/vhost-scsi: don't double close vhostfd on errorDaniil Tatianin1-1/+8
vhost_dev_init calls vhost_dev_cleanup on error, which closes vhostfd, don't double close it. Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru> Message-Id: <20211129132358.1110372-2-d-tatianin@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07hw/scsi/vhost-scsi: don't leak vqs on errorDaniil Tatianin1-2/+4
vhost_dev_init calls vhost_dev_cleanup in case of an error during initialization, which zeroes out the entire vsc->dev as well as the vsc->dev.vqs pointer. This prevents us from properly freeing it in free_vqs. Keep a local copy of the pointer so we can free it later. Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru> Message-Id: <20211129132358.1110372-1-d-tatianin@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07docs: reSTify virtio-balloon-stats documentation and move to docs/interopThomas Huth3-28/+32
The virtio-balloon-stats documentation might be useful for people that are implementing software that talks to QEMU via QMP, so this should reside in the docs/interop/ directory. While we're at it, also convert the file to restructured text and mention it in the MAINTAINERS file. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20220105115245.420945-1-thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07hw/i386/pc: Add missing property descriptionsThomas Huth1-0/+8
When running "qemu-system-x86_64 -M pc,help" I noticed that some properties were still missing their description. Add them now so that users get at least a slightly better idea what they are all about. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20211206134255.94784-1-thuth@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07acpihp: simplify acpi_pcihp_disable_root_busAni Sinha1-7/+2
Get rid of the static variable that keeps track of whether hotplug has been disabled on the root pci bus. Simply use qbus_is_hotpluggable() api to perform the same check. This eliminates additional if conditional and simplifies the function. Signed-off-by: Ani Sinha <ani@anisinha.ca> Message-Id: <1640764674-7784-1-git-send-email-ani@anirban.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07tests: acpi: SLIC: update expected blobsIgor Mammedov3-2/+0
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20211227193120.1084176-5-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07tests: acpi: add SLIC table testIgor Mammedov1-0/+15
When user uses '-acpitable' to add SLIC table, some ACPI tables (FADT) will change its 'Oem ID'/'Oem Table ID' fields to match that of SLIC. Test makes sure thati QEMU handles those fields correctly when SLIC table is added with '-acpitable' option. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20211227193120.1084176-4-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07tests: acpi: whitelist expected blobs before changing themIgor Mammedov3-0/+2
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20211227193120.1084176-3-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07acpi: fix QEMU crash when started with SLIC tableIgor Mammedov2-2/+4
if QEMU is started with used provided SLIC table blob, -acpitable sig=SLIC,oem_id='CRASH ',oem_table_id="ME",oem_rev=00002210,asl_compiler_id="",asl_compiler_rev=00000000,data=/dev/null it will assert with: hw/acpi/aml-build.c:61:build_append_padded_str: assertion failed: (len <= maxlen) and following backtrace: ... build_append_padded_str (array=0x555556afe320, str=0x555556afdb2e "CRASH ME", maxlen=0x6, pad=0x20) at hw/acpi/aml-build.c:61 acpi_table_begin (desc=0x7fffffffd1b0, array=0x555556afe320) at hw/acpi/aml-build.c:1727 build_fadt (tbl=0x555556afe320, linker=0x555557ca3830, f=0x7fffffffd318, oem_id=0x555556afdb2e "CRASH ME", oem_table_id=0x555556afdb34 "ME") at hw/acpi/aml-build.c:2064 ... which happens due to acpi_table_begin() expecting NULL terminated oem_id and oem_table_id strings, which is normally the case, but in case of user provided SLIC table, oem_id points to table's blob directly and as result oem_id became longer than expected. Fix issue by handling oem_id consistently and make acpi_get_slic_oem() return NULL terminated strings. PS: After [1] refactoring, oem_id semantics became inconsistent, where NULL terminated string was coming from machine and old way pointer into byte array coming from -acpitable option. That used to work since build_header() wasn't expecting NULL terminated string and blindly copied the 1st 6 bytes only. However commit [2] broke that by replacing build_header() with acpi_table_begin(), which was expecting NULL terminated string and was checking oem_id size. 1) 602b45820 ("acpi: Permit OEM ID and OEM table ID fields to be changed") 2) Fixes: 4b56e1e4eb08 ("acpi: build_fadt: use acpi_table_begin()/acpi_table_end() instead of build_header()") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/786 Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20211227193120.1084176-2-imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Denis Lisov <dennis.lissov@gmail.com> Tested-by: Alexander Tsoy <alexander@tsoy.me> Cc: qemu-stable@nongnu.org Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07intel-iommu: correctly check passthrough during translationJason Wang1-15/+23
When scalable mode is enabled, the passthrough more is not determined by the context entry but PASID entry, so switch to use the logic of vtd_dev_pt_enabled() to determine the passthrough mode in vtd_do_iommu_translate(). Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220105041945.13459-2-jasowang@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07virtio-mem: Set "unplugged-inaccessible=auto" for the 7.0 machine on x86David Hildenbrand2-2/+4
Set the new default to "auto", keeping it set to "off" for compat machines. This property is only available for x86 targets. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134039.29670-4-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07virtio-mem: Support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLEDavid Hildenbrand2-0/+71
With VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we signal the VM that reading unplugged memory is not supported. We have to fail feature negotiation in case the guest does not support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE. First, VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE is required to properly handle memory backends (or architectures) without support for the shared zeropage in the hypervisor cleanly. Without the shared zeropage, even reading an unpopulated virtual memory location can populate real memory and consequently consume memory in the hypervisor. We have a guaranteed shared zeropage only on MAP_PRIVATE anonymous memory. Second, we want VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE to be the default long-term as even populating the shared zeropage can be problematic: for example, without THP support (possible) or without support for the shared huge zeropage with THP (unlikely), the PTE page tables to hold the shared zeropage entries can consume quite some memory that cannot be reclaimed easily. Third, there are other optimizations+features (e.g., protection of unplugged memory, reducing the total memory slot size and bitmap sizes) that will require VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE. We really only support x86 targets with virtio-mem for now (and Linux similarly only support x86), but that might change soon, so prepare for different targets already. Add a new "unplugged-inaccessible" tristate property for x86 targets: - "off" will keep VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE unset and legacy guests working. - "on" will set VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE and stop legacy guests from using the device. - "auto" selects the default based on support for the shared zeropage. Warn in case the property is set to "off" and we don't have support for the shared zeropage. For existing compat machines, the property will default to "off", to not change the behavior but eventually warn about a problematic setup. Short-term, we'll set the property default to "auto" for new QEMU machines. Mid-term, we'll set the property default to "on" for new QEMU machines. Long-term, we'll deprecate the parameter and disallow legacy guests completely. The property has to match on the migration source and destination. "auto" will result in the same VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE setting as long as the qemu command line (esp. memdev) match -- so "auto" is good enough for migration purposes and the parameter doesn't have to be migrated explicitly. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134039.29670-3-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07linux-headers: sync VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLEDavid Hildenbrand1-3/+6
Let's synchronize the new feature flag, available in Linux since v5.16-rc1. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134039.29670-2-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07MAINTAINERS: Add a separate entry for acpi/VIOT tablesAni Sinha1-0/+7
All work related to VIOT tables are being done by Jean. Adding him as the maintainer for acpi VIOT table code in qemu. Signed-off-by: Ani Sinha <ani@anisinha.ca> Message-Id: <20211213045924.344214-1-ani@anisinha.ca> Acked-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07virtio: signal after wrapping packed used_idxStefan Hajnoczi1-0/+1
Packed Virtqueues wrap used_idx instead of letting it run freely like Split Virtqueues do. If the used ring wraps more than once there is no way to compare vq->signalled_used and vq->used_idx in virtio_packed_should_notify() since they are modulo vq->vring.num. This causes the device to stop sending used buffer notifications when when virtio_packed_should_notify() is called less than once each time around the used ring. It is possible to trigger this with virtio-blk's dataplane notify_guest_bh() irq coalescing optimization. The call to virtio_notify_irqfd() (and virtio_packed_should_notify()) is deferred to a BH. If the guest driver is polling it can complete and submit more requests before the BH executes, causing the used ring to wrap more than once. The result is that the virtio-blk device ceases to raise interrupts and I/O hangs. Cc: Tiwei Bie <tiwei.bie@intel.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20211130134510.267382-1-stefanha@redhat.com> Fixes: 86044b24e865fb9596ed77a4d0f3af8b90a088a1 ("virtio: basic packed virtqueue support") Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07virtio-mem: Support "prealloc=on" optionDavid Hildenbrand2-4/+39
For scarce memory resources, such as hugetlb, we want to be able to prealloc such memory resources in order to not crash later on access. On simple user errors we could otherwise easily run out of memory resources an crash the VM -- pretty much undesired. For ordinary memory devices, such as DIMMs, we preallocate memory via the memory backend for such use cases; however, with virtio-mem we're dealing with sparse memory backends; preallocating the whole memory backend destroys the whole purpose of virtio-mem. Instead, we want to preallocate memory when actually exposing memory to the VM dynamically, and fail plugging memory gracefully + warn the user in case preallocation fails. A common use case for hugetlb will be using "reserve=off,prealloc=off" for the memory backend and "prealloc=on" for the virtio-mem device. This way, no huge pages will be reserved for the process, but we can recover if there are no actual huge pages when plugging memory. Libvirt is already prepared for this. Note that preallocation cannot protect from the OOM killer -- which holds true for any kind of preallocation in QEMU. It's primarily useful only for scarce memory resources such as hugetlb, or shared file-backed memory. It's of little use for ordinary anonymous memory that can be swapped, KSM merged, ... but we won't forbid it. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-9-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Forward SIGBUS to MCE handler under LinuxDavid Hildenbrand2-3/+38
Temporarily modifying the SIGBUS handler is really nasty, as we might be unlucky and receive an MCE SIGBUS while having our handler registered. Unfortunately, there is no way around messing with SIGBUS when MADV_POPULATE_WRITE is not applicable or not around. Let's forward SIGBUS that don't belong to us to the already registered handler and document the situation. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-8-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Support concurrent os_mem_prealloc() invocationDavid Hildenbrand1-0/+9
Add a mutex to protect the SIGBUS case, as we cannot mess concurrently with the sigbus handler and we have to manage the global variable sigbus_memset_context. The MADV_POPULATE_WRITE path can run concurrently. Note that page_mutex and page_cond are shared between concurrent invocations, which shouldn't be a problem. This is a preparation for future virtio-mem prealloc code, which will call os_mem_prealloc() asynchronously from an iothread when handling guest requests. Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-7-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Avoid creating a single thread with MADV_POPULATE_WRITEDavid Hildenbrand1-0/+8
Let's simplify the case when we only want a single thread and don't have to mess with signal handlers. Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-6-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Don't create too many threads with small memory or little ↵David Hildenbrand1-2/+10
pages Let's limit the number of threads to something sane, especially that - We don't have more threads than the number of pages we have - We don't have threads that initialize small (< 64 MiB) memory Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-5-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Introduce and use MemsetContext for touch_all_pages()David Hildenbrand1-26/+47
Let's minimize the number of global variables to prepare for os_mem_prealloc() getting called concurrently and make the code a bit easier to read. The only consumer that really needs a global variable is the sigbus handler, which will require protection via a mutex in the future either way as we cannot concurrently mess with the SIGBUS handler. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-4-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Support MADV_POPULATE_WRITE for os_mem_prealloc()David Hildenbrand2-21/+69
Let's sense support and use it for preallocation. MADV_POPULATE_WRITE does not require a SIGBUS handler, doesn't actually touch page content, and avoids context switches; it is, therefore, faster and easier to handle than our current approach. While MADV_POPULATE_WRITE is, in general, faster than manual prefaulting, and especially faster with 4k pages, there is still value in prefaulting using multiple threads to speed up preallocation. More details on MADV_POPULATE_WRITE can be found in the Linux commits 4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables") and eb2faa513c24 ("mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE)"), and in the man page proposal [1]. This resolves the TODO in do_touch_pages(). In the future, we might want to look into using fallocate(), eventually combined with MADV_POPULATE_READ, when dealing with shared file/fd mappings and not caring about memory bindings. [1] https://lkml.kernel.org/r/20210816081922.5155-1-david@redhat.com Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-3-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Let touch_all_pages() return an errorDavid Hildenbrand1-12/+16
Let's prepare touch_all_pages() for returning differing errors. Return an error from the thread and report the last processed error. Translate SIGBUS to -EFAULT, as a SIGBUS can mean all different kind of things (memory error, read error, out of memory). When allocating memory fails via the current SIGBUS-based mechanism, we'll get: os_mem_prealloc: preallocating memory failed: Bad address Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-2-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07hw/vhost-user-blk: turn on VIRTIO_BLK_F_SIZE_MAX feature for virtio blk deviceAndy Pei1-0/+1
Turn on pre-defined feature VIRTIO_BLK_F_SIZE_MAX for virtio blk device to avoid guest DMA request sizes which are too large for hardware spec. Signed-off-by: Andy Pei <andy.pei@intel.com> Message-Id: <1641202092-149677-1-git-send-email-andy.pei@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2022-01-07hw/i386: expose a "smbios-entry-point-type" PC machine propertyEduardo Habkost4-2/+32
The i440fx and Q35 machine types are both hardcoded to use the legacy SMBIOS 2.1 (32-bit) entry point. This is a sensible conservative choice because SeaBIOS only supports SMBIOS 2.1 EDK2, however, can also support SMBIOS 3.0 (64-bit) entry points, and QEMU already uses this on the ARM virt machine type. This adds a property to allow the choice of SMBIOS entry point versions For example to opt in to 64-bit SMBIOS entry point: $QEMU -machine q35,smbios-entry-point-type=64 Based on a patch submitted by Daniel Berrangé. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20211026151100.1691925-4-ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2022-01-07hw/smbios: Use qapi for SmbiosEntryPointTypeEduardo Habkost2-8/+14
This prepares for exposing the SMBIOS entry point type as a machine property on x86. Based on a patch from Daniel P. Berrangé. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20211026151100.1691925-3-ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com>
2022-01-07smbios: Rename SMBIOS_ENTRY_POINT_* enumsEduardo Habkost5-9/+9
Rename the enums to match the naming style used by QAPI, and to use "32" and "64" instead of "20" and "31". This will allow us to more easily move the enum to the QAPI schema later. About the naming choice: "SMBIOS 2.1 entry point"/"SMBIOS 3.0 entry point" and "32-bit entry point"/"64-bit entry point" are synonymous in the SMBIOS specification. However, the phrases "32-bit entry point" and "64-bit entry point" are used more often. The new names also avoid confusion between the entry point format and the actual SMBIOS version reported in the entry point structure. For example: currently the 32-bit entry point actually report SMBIOS 2.8 support, not 2.1. Based on portions of a patch submitted by Daniel P. Berrangé. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20211026151100.1691925-2-ehabkost@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07pcie_aer: Don't trigger a LSI if none are definedFrederic Barrat1-1/+3
Skip triggering an LSI when the AER root error status is updated if no LSI is defined for the device. We can have a root bridge with no LSI, MSI and MSI-X defined, for example on POWER systems. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Message-Id: <20211116170133.724751-4-fbarrat@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org>
2022-01-07pci: Export the pci_intx() functionFrederic Barrat2-5/+5
Move the pci_intx() definition to the PCI header file, so that it can be called from other PCI files. It is used by the next patch. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Message-Id: <20211116170133.724751-3-fbarrat@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org>
2022-01-07vhost-user-blk: propagate error return from generic vhostRoman Kagan1-1/+1
Fix the only callsite that doesn't propagate the error code from the generic vhost code. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-11-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2022-01-07vhost: stick to -errno error return conventionRoman Kagan1-54/+46
The generic vhost code expects that many of the VhostOps methods in the respective backends set errno on errors. However, none of the existing backends actually bothers to do so. In a number of those methods errno from the failed call is clobbered by successful later calls to some library functions; on a few code paths the generic vhost code then negates and returns that errno, thus making failures look as successes to the caller. As a result, in certain scenarios (e.g. live migration) the device doesn't notice the first failure and goes on through its state transitions as if everything is ok, instead of taking recovery actions (break and reestablish the vhost-user connection, cancel migration, etc) before it's too late. To fix this, consolidate on the convention to return negated errno on failures throughout generic vhost, and use it for error propagation. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-10-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07vhost-user: stick to -errno error return conventionRoman Kagan1-178/+223
VhostOps methods in user_ops are not very consistent in their error returns: some return negated errno while others just -1. Make sure all of them consistently return negated errno. This also helps error propagation from the functions being called inside. Besides, this synchronizes the error return convention with the other two vhost backends, kernel and vdpa, and will therefore allow for consistent error propagation in the generic vhost code (in a followup patch). Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-9-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07vhost-vdpa: stick to -errno error return conventionRoman Kagan1-14/+23
Almost all VhostOps methods in vdpa_ops follow the convention of returning negated errno on error. Adjust the few that don't. To that end, rework vhost_vdpa_add_status to check if setting of the requested status bits has succeeded and return the respective error code it hasn't, and propagate the error codes wherever it's appropriate. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-8-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07vhost-backend: stick to -errno error return conventionRoman Kagan1-1/+1
Almost all VhostOps methods in kernel_ops follow the convention of returning negated errno on error. Adjust the only one that doesn't. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-7-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2022-01-07vhost-backend: avoid overflow on memslots_limitRoman Kagan1-1/+1
Fix the (hypothetical) potential problem when the value parsed out of the vhost module parameter in sysfs overflows the return value from vhost_kernel_memslots_limit. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-6-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07chardev/char-socket: tcp_chr_sync_read: don't clobber errnoRoman Kagan1-0/+3
After the return from tcp_chr_recv, tcp_chr_sync_read calls into a function which eventually makes a system call and may clobber errno. Make a copy of errno right after tcp_chr_recv and restore the errno on return from tcp_chr_sync_read. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-4-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2022-01-07chardev/char-socket: tcp_chr_recv: don't clobber errnoRoman Kagan1-7/+7
tcp_chr_recv communicates the specific error condition to the caller via errno. However, after setting it, it may call into some system calls or library functions which can clobber the errno. Avoid this by moving the errno assignment to the end of the function. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-3-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2022-01-07vhost-user-blk: reconnect on any error during realizeRoman Kagan1-1/+1
vhost-user-blk realize only attempts to reconnect if the previous connection attempt failed on "a problem with the connection and not an error related to the content (which would fail again the same way in the next attempt)". However this distinction is very subtle, and may be inadvertently broken if the code changes somewhere deep down the stack and a new error gets propagated up to here. OTOH now that the number of reconnection attempts is limited it seems harmless to try reconnecting on any error. So relax the condition of whether to retry connecting to check for any error. This patch amends a527e312b5 "vhost-user-blk: Implement reconnection during realize". Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-2-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2022-01-07trace-events,pci: unify trace events formatLaurent Vivier3-8/+10
Unify format used by trace_pci_update_mappings_del(), trace_pci_update_mappings_add(), trace_pci_cfg_write() and trace_pci_cfg_read() to print the device name and bus number, slot number and function number. For instance: pci_cfg_read virtio-net-pci 00:0 @0x20 -> 0xffffc00c pci_cfg_write virtio-net-pci 00:0 @0x20 <- 0xfea0000c pci_update_mappings_del d=0x555810b92330 01:00.0 4,0xffffc000+0x4000 pci_update_mappings_add d=0x555810b92330 01:00.0 4,0xfea00000+0x4000 becomes pci_cfg_read virtio-net-pci 01:00.0 @0x20 -> 0xffffc00c pci_cfg_write virtio-net-pci 01:00.0 @0x20 <- 0xfea0000c pci_update_mappings_del virtio-net-pci 01:00.0 4,0xffffc000+0x4000 pci_update_mappings_add virtio-net-pci 01:00.0 4,0xfea00000+0x4000 Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20211105192541.655831-1-lvivier@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07virtio-pci: add support for configure interruptCindy Lu2-13/+83
Add support for configure interrupt, The process is used kvm_irqfd_assign to set the gsi to kernel. When the configure notifier was signal by host, qemu will inject a msix interrupt to guest Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-11-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07virtio-mmio: add support for configure interruptCindy Lu1-0/+27
Add configure interrupt support for virtio-mmio bus. This interrupt will be working while the backend is vhost-vdpa Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-10-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07virtio-net: add support for configure interruptCindy Lu4-2/+22
Add functions to support configure interrupt in virtio_net The functions are config_pending and config_mask, while this input idx is VIRTIO_CONFIG_IRQ_IDX will check the function of configure interrupt. Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-9-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06vhost: add support for configure interruptCindy Lu2-0/+80
Add functions to support configure interrupt. The configure interrupt process will start in vhost_dev_start and stop in vhost_dev_stop. Also add the functions to support vhost_config_pending and vhost_config_mask, for masked_config_notifier, we only use the notifier saved in vq 0. Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-8-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06virtio: add support for configure interruptCindy Lu2-0/+33
Add the functions to support the configure interrupt in virtio The function virtio_config_guest_notifier_read will notify the guest if there is an configure interrupt. The function virtio_config_set_guest_notifier_fd_handler is to set the fd hander for the notifier Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-7-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06vhost-vdpa: add support for config interruptCindy Lu2-0/+8
Add new call back function in vhost-vdpa, this function will set the event fd to kernel. This function will be called in the vhost_dev_start and vhost_dev_stop Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-6-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06vhost: introduce new VhostOps vhost_set_config_callCindy Lu1-0/+3
This patch introduces new VhostOps vhost_set_config_call. This function allows the vhost to set the event fd to kernel Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-5-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>