aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-01-17virtio: Make disable-legacy/disable-modern compat properties optionalEduardo Habkost1-2/+3
The disable-legacy and disable-modern properties apply only to some virtio-pci devices. Make those properties optional. This fixes the crash introduced by commit f6e501a28ef9 ("virtio: Provide version-specific variants of virtio PCI devices"): $ qemu-system-x86_64 -machine pc-i440fx-2.6 \ -device virtio-net-pci-non-transitional Unexpected error in object_property_find() at qom/object.c:1092: qemu-system-x86_64: -device virtio-net-pci-non-transitional: can't apply \ global virtio-pci.disable-modern=on: Property '.disable-modern' not found Aborted (core dumped) Reported-by: Thomas Huth <thuth@redhat.com> Fixes: f6e501a28ef9 ("virtio: Provide version-specific variants of virtio PCI devices") Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17globals: Allow global properties to be optionalEduardo Habkost2-0/+6
Making some global properties optional will let us simplify compat code when a given property works on most (but not all) subclasses of a given type. Device types will be able to opt out from optional compat properties by simply not registering those properties. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: virtio 9p really requires CONFIG_VIRTFS to workJuan Quintela1-1/+1
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split virtio crypto bits from virtio-pci.hJuan Quintela2-14/+14
Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split virtio gpu bits from virtio-pci.hJuan Quintela3-14/+15
Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split virtio serial bits from virtio-pciJuan Quintela5-96/+119
Virtio console and qga tests also depend on CONFIG_VIRTIO_SERIAL. Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split virtio net bits from virtio-pciJuan Quintela5-74/+100
Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split virtio blk bits from virtio-pciJuan Quintela5-77/+103
Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split virtio scsi bits from virtio-pciJuan Quintela5-86/+109
Notice that we can't still run tests with it disabled. Both cdrom-test and drive_del-test use virtio-scsi without checking if it is enabled. Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split vhost scsi bits from virtio-pciJuan Quintela4-80/+98
Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split vhost user scsi bits from virtio-pciJuan Quintela4-71/+104
Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split vhost user blk bits from virtio-pciJuan Quintela4-80/+104
Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split virtio 9p bits from virtio-pciJuan Quintela5-75/+90
Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Acked-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split virtio balloon bits from virtio-pciJuan Quintela5-75/+98
Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split virtio rng bits from virtio-pciJuan Quintela5-69/+90
Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split virtio input bits from virtio-pciJuan Quintela4-135/+158
Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split virtio input host bits from virtio-pciJuan Quintela5-37/+50
For consistency with other devices, rename virtio_host_{initfn,pci_info} to virtio_input_host_{initfn,info}. Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio: split vhost vsock bits from virtio-pciJuan Quintela4-71/+89
Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio-net: changed VIRTIO_NET_F_RSC_EXT to be 61Yuri Benditovich1-1/+1
Allocated feature bit changed in spec draft per TC request. Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio-net: support RSC v4/v6 tcp traffic for Windows HCKYuri Benditovich3-1/+751
This commit adds implementation of RX packets coalescing, compatible with requirements of Windows Hardware compatibility kit. The device enables feature VIRTIO_NET_F_RSC_EXT in host features if it supports extended RSC functionality as defined in the specification. This feature requires at least one of VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6. Windows guest driver acks this feature only if VIRTIO_NET_F_CTRL_GUEST_OFFLOADS is also present. If the guest driver acks VIRTIO_NET_F_RSC_EXT feature, the device coalesces TCPv4 and TCPv6 packets (if respective VIRTIO_NET_F_GUEST_TSO feature is on, populates extended RSC information in virtio header and sets VIRTIO_NET_HDR_F_RSC_INFO bit in header flags. The device does not recalculate checksums in the coalesced packet, so they are not valid. In this case: All the data packets in a tcp connection are cached to a single buffer in every receive interval, and will be sent out via a timer, the 'virtio_net_rsc_timeout' controls the interval, this value may impact the performance and response time of tcp connection, 50000(50us) is an experience value to gain a performance improvement, since the whql test sends packets every 100us, so '300000(300us)' passes the test case, it is the default value as well, tune it via the command line parameter 'rsc_interval' within 'virtio-net-pci' device, for example, to launch a guest with interval set as '500000': 'virtio-net-pci,netdev=hostnet1,bus=pci.0,id=net1,mac=00, guest_rsc_ext=on,rsc_interval=500000' The timer will only be triggered if the packets pool is not empty, and it'll drain off all the cached packets. 'NetRscChain' is used to save the segments of IPv4/6 in a VirtIONet device. A new segment becomes a 'Candidate' as well as it passed sanity check, the main handler of TCP includes TCP window update, duplicated ACK check and the real data coalescing. An 'Candidate' segment means: 1. Segment is within current window and the sequence is the expected one. 2. 'ACK' of the segment is in the valid window. Sanity check includes: 1. Incorrect version in IP header 2. An IP options or IP fragment 3. Not a TCP packet 4. Sanity size check to prevent buffer overflow attack. 5. An ECN packet Even though, there might more cases should be considered such as ip identification other flags, while it breaks the test because windows set it to the same even it's not a fragment. Normally it includes 2 typical ways to handle a TCP control flag, 'bypass' and 'finalize', 'bypass' means should be sent out directly, while 'finalize' means the packets should also be bypassed, but this should be done after search for the same connection packets in the pool and drain all of them out, this is to avoid out of order fragment. All the 'SYN' packets will be bypassed since this always begin a new' connection, other flags such 'URG/FIN/RST/CWR/ECE' will trigger a finalization, because this normally happens upon a connection is going to be closed, an 'URG' packet also finalize current coalescing unit. Statistics can be used to monitor the basic coalescing status, the 'out of order' and 'out of window' means how many retransmitting packets, thus describe the performance intuitively. Difference between ip v4 and v6 processing: Fragment length in ipv4 header includes itself, while it's not included for ipv6, thus means ipv6 can carry a real 65535 payload. Note that main goal of implementing this feature in software is to create reference setup for certification tests. In such setups guest migration is not required, so the coalesced packets not yet delivered to the guest will be lost in case of migration. Signed-off-by: Wei Xu <wexu@redhat.com> Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17tests: acpi: use AcpiSdtTable::aml instead of AcpiSdtTable::header::signatureIgor Mammedov2-16/+10
AcpiSdtTable::header::signature is the only remained field from AcpiTableHeader structure used by tests. Instead of using packed structure to access signature, just read it directly from table blob and remove no longer used AcpiSdtTable::header / union and keep only AcpiSdtTable::aml byte array. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17tests: acpi: squash sanitize_fadt_ptrs() into test_acpi_fadt_table()Igor Mammedov1-29/+10
some parts of sanitize_fadt_ptrs() do redundant job - locating FADT - checking original checksum There is no need to do it as test_acpi_fadt_table() already does that, so drop duplicate code and move remaining fixup code into test_acpi_fadt_table(). Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17tests: smbios: fetch whole table in one step instead of reading it step by stepIgor Mammedov2-31/+1
replace a bunch of ACPI_READ_ARRAY/ACPI_READ_FIELD macro, that read SMBIOS table field by field with one memread() to fetch whole table at once and drop no longer used ACPI_READ_ARRAY/ACPI_READ_FIELD macro. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17tests: acpi: reuse fetch_table() in vmgenid-testIgor Mammedov4-110/+69
Move fetch_table() into acpi-utils.c renaming it to acpi_fetch_table() and reuse it in vmgenid-test that reads RSDT and then tables it references, to find and parse VMGNEID SSDT. While at it wrap RSDT referenced tables enumeration into FOREACH macro (similar to what we do with QLIST_FOREACH & co) to reuse it with bios and vmgenid tests. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-14tests: acpi: reuse fetch_table() for fetching FACS and DSDTIgor Mammedov1-48/+30
It allows to remove a bit more of code duplication and reuse common utility to get ACPI tables from guest (modulo RSDP). While at it, consolidate signature checking into fetch_table() instead of open-codding it. Considering FACS is special and doesn't have checksum, make checksum validation optin, the same goes for signature verification. PS: By pure accident, patch also fixes FACS not being tested against reference table since it wasn't added to data::tables list. But we managed not to regress it since reference file was added by commit (d25979380 acpi unit-test: add test files) back in 2013 Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-14tests: acpi: simplify rsdt handlingIgor Mammedov1-82/+55
RSDT referenced tables always have length at offset 4 and checksum at offset 9, that's enough for reusing fetch_table() and replacing custom RSDT fetching code with it. While at it * merge fetch_rsdt_referenced_tables() into test_acpi_rsdt_table() * drop test_data::rsdt_table/rsdt_tables_addr/rsdt_tables_nr since we need this data only for duration of test_acpi_rsdt_table() to fetch other tables and use locals instead. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-14tests: acpi: make sure FADT is fetched only onceIgor Mammedov1-11/+8
Whole FADT is fetched as part of RSDT referenced tables in fetch_rsdt_referenced_tables() albeit a bit later than when FADT is partially parsed in fadt_fetch_facs_and_dsdt_ptrs(). However there is no reason for calling fetch_rsdt_referenced_tables() so late, just move it right after we fetched RSDT and before fadt_fetch_facs_and_dsdt_ptrs(). That way we can reuse whole FADT fetched by fetch_rsdt_referenced_tables() and avoid duplicate custom fields fetching in fadt_fetch_facs_and_dsdt_ptrs(). While at it rename fadt_fetch_facs_and_dsdt_ptrs() to test_acpi_fadt_table(). The follow up patch will merge fadt_fetch_facs_and_dsdt_ptrs() into test_acpi_rsdt_table(), so that we would end up calling only test_acpi_FOO_table() for consistency for tables that require special processing. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-14tests: acpi: use AcpiSdtTable::aml in consistent wayIgor Mammedov2-42/+28
Currently in the 1st case we store table body fetched from QEMU in AcpiSdtTable::aml minus it's header but in the 2nd case when we load reference aml from disk, it holds whole blob including header. More over in the 1st case, we read header in separate AcpiSdtTable::header structure and then jump over hoops to fixup tables and combine both. Treat AcpiSdtTable::aml as whole table blob approach in both cases and when fetching tables from QEMU, first get table length and then fetch whole table into AcpiSdtTable::aml instead if doing it field by field. As result * AcpiSdtTable::aml is used in consistent manner * FADT fixups use offsets from spec instead of being shifted by header length * calculating checksums and dumping blobs becomes simpler Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-14util: check the return value of fcntl in qemu_set_{block, nonblock}Li Qiang1-2/+6
Assert that the return value is not an error. This is like commit 7e6478e7d4f for qemu_set_cloexec. Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-14vhost-user: fix ioeventfd_enabledLi Qiang1-1/+1
Currently, the vhost-user-test assumes the eventfd is available. However it's not true because the accel is qtest. So the 'vhost_set_vring_file' will not add fds to the msg and the server side of vhost-user-test will be broken. The bug is in 'ioeventfd_enabled'. We should make this function return true if not using kvm accel. Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-14tests: vhost-user-test: initialize 'fd' in chr_readLi Qiang1-1/+1
Currently when processing VHOST_USER_SET_VRING_CALL if 'qemu_chr_fe_get_msgfds' get no fd, the 'fd' will be a stack uninitialized value. Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-14qemu: avoid memory leak while remove diskJian Wang3-4/+9
Memset vhost_dev to zero in the vhost_dev_cleanup function. This causes dev.vqs to be NULL, so that vqs does not free up space when calling the g_free function. This will result in a memory leak. But you can't release vqs directly in the vhost_dev_cleanup function, because vhost_net will also call this function, and vhost_net's vqs is assigned by array. In order to solve this problem, we first save the pointer of vqs, and release the space of vqs after vhost_dev_cleanup is called. Signed-off-by: Jian Wang <wangjian161@huawei.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-14hw/misc/ivshmem: Remove deprecated "ivshmem" legacy deviceThomas Huth6-258/+34
It's been marked as deprecated in QEMU v2.6.0 already, so really nobody should use the legacy "ivshmem" device anymore (but use ivshmem-plain or ivshmem-doorbell instead). Time to remove the deprecated device now. Belatedly also update a mention of the deprecated "ivshmem" in the file docs/specs/ivshmem-spec.txt to "ivshmem-doorbell". Missed in commit 5400c02b90b ("ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem"). Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-14msix: make pba size math more uniformDongli Zhang1-1/+1
In msix_exclusive_bar the bar_pba_size is more than what the pba is expected to have, although this never affects the bar size. Specifically, the math in msix_init_exclusive_bar allocates too much memory in some cases. For example consider nentries = 8. msix_exclusive_bar will give us bar_pba_size = 16. So 16 bytes. However 8 bytes would be enough - this is all that the spec requires. So in practice bar_pba_size sometimes allocates an extra 8 bytes but never more. Since each MSIX entry size is 16 bytes, and since we make sure that table+pba is a power of two, this always leaves a multiple of 16 bytes for the PBA, so extra 8 bytes have no effect. However, its ugly to have pba size temporary variable have an incorrect value. For consistency switch to the formula used in msix_init. Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-14pci/pcie: stop plug/unplug if the slot is lockedDavid Hildenbrand3-8/+20
We better stop right away. For now, errors would be partially ignored (so the guest might get informed or the device might get unplugged), although actual plug/unplug will be reported as failed to the user. While at it, properly move the check to the pre_plug handler for the plug case, as we can test the slot state before the device will be realized. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-14Merge remote-tracking branch 'remotes/ehabkost/tags/x86-next-pull-request' ↵Peter Maydell5-54/+126
into staging x86 queue, 2019-01-14 * Reenable RDTSCP support on Opteron_G[345] CPU models CPU models (Borislav Petkov) * host-phys-bits-limit option for better control of 5-level EPT (Eduardo Habkost) * Disable MPX support on named CPU models (Paolo Bonzini) * expose HV_CPUID_ENLIGHTMENT_INFO.EAX and HV_CPUID_NESTED_FEATURES.EAX as feature words (Vitaly Kuznetsov) # gpg: Signature made Mon 14 Jan 2019 14:33:55 GMT # gpg: using RSA key 2807936F984DC5A6 # gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" # Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6 * remotes/ehabkost/tags/x86-next-pull-request: i386/kvm: add a comment explaining why .feat_names are commented out for Hyper-V feature bits x86: host-phys-bits-limit option target/i386: Disable MPX support on named CPU models target-i386: Reenable RDTSCP support on Opteron_G[345] CPU models CPU models i386/kvm: expose HV_CPUID_ENLIGHTMENT_INFO.EAX and HV_CPUID_NESTED_FEATURES.EAX as feature words Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-01-14i386/kvm: add a comment explaining why .feat_names are commented out for ↵Vitaly Kuznetsov1-0/+7
Hyper-V feature bits Hyper-V .feat_names are, unlike hardware features, commented out and it is not obvious why we do that. Document the current status quo. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20181221141604.16935-1-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-14x86: host-phys-bits-limit optionEduardo Habkost2-0/+8
Some downstream distributions of QEMU set host-phys-bits=on by default. This worked very well for most use cases, because phys-bits really didn't have huge consequences. The only difference was on the CPUID data seen by guests, and on the handling of reserved bits. This changed in KVM commit 855feb673640 ("KVM: MMU: Add 5 level EPT & Shadow page table support"). Now choosing a large phys-bits value for a VM has bigger impact: it will make KVM use 5-level EPT even when it's not really necessary. This means using the host phys-bits value may not be the best choice. Management software could address this problem by manually configuring phys-bits depending on the size of the VM and the amount of MMIO address space required for hotplug. But this is not trivial to implement. However, there's another workaround that would work for most cases: keep using the host phys-bits value, but only if it's smaller than 48. This patch makes this possible by introducing a new "-cpu" option: "host-phys-bits-limit". Management software or users can make sure they will always use 4-level EPT using: "host-phys-bits=on,host-phys-bits-limit=48". This behavior is still not enabled by default because QEMU doesn't enable host-phys-bits=on by default. But users, management software, or downstream distributions may choose to change their defaults using the new option. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20181211192527.13254-1-ehabkost@redhat.com> [ehabkost: removed test code while some issues are addressed] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-14target/i386: Disable MPX support on named CPU modelsPaolo Bonzini2-7/+14
MPX support is being phased out by Intel; GCC has dropped it, Linux is also going to do that. Even though KVM will have special code to support MPX after the kernel proper stops enabling it in XCR0, we probably also want to deprecate that in a few years. As a start, do not enable it by default for any named CPU model starting with the 4.0 machine types; this include Skylake, Icelake and Cascadelake. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20181220121100.21554-1-pbonzini@redhat.com> Reviewed-by:   Wainer dos Santos Moschetta <wainersm@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-14target-i386: Reenable RDTSCP support on Opteron_G[345] CPU models CPU modelsBorislav Petkov3-7/+20
The missing functionality was added ~3 years ago with the Linux commit 46896c73c1a4 ("KVM: svm: add support for RDTSCP") so reenable RDTSCP support on those CPU models. Opteron_G2 - being family 15, model 6, doesn't have RDTSCP support (the real hardware doesn't have it. K8 got RDTSCP support with the NPT models, i.e., models >= 0x40). Document the host's minimum required kernel version, while at it. Signed-off-by: Borislav Petkov <bp@suse.de> Message-ID: <20181212200803.GG6653@zn.tnic> [ehabkost: moved compat properties code to pc.c] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-14i386/kvm: expose HV_CPUID_ENLIGHTMENT_INFO.EAX and ↵Vitaly Kuznetsov3-40/+77
HV_CPUID_NESTED_FEATURES.EAX as feature words It was found that QMP users of QEMU (e.g. libvirt) may need HV_CPUID_ENLIGHTMENT_INFO.EAX/HV_CPUID_NESTED_FEATURES.EAX information. In particular, 'hv_tlbflush' and 'hv_evmcs' enlightenments are only exposed in HV_CPUID_ENLIGHTMENT_INFO.EAX. HV_CPUID_NESTED_FEATURES.EAX is exposed for two reasons: convenience (we don't need to export it from hyperv_handle_properties() and as future-proof for Enlightened MSR-Bitmap, PV EPT invalidation and direct virtual flush features. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20181126135958.20956-1-vkuznets@redhat.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-14Merge remote-tracking branch 'remotes/aperard/tags/pull-xen-20190114' into ↵Peter Maydell45-1536/+3923
staging Xen queue * Xen PV backend 'qdevification'. Starting with xen_disk. * Performance improvements for xen-block. * Remove of the Xen PV domain builder. * bug fixes. # gpg: Signature made Mon 14 Jan 2019 13:46:33 GMT # gpg: using RSA key 0CF5572FD7FB55AF # gpg: Good signature from "Anthony PERARD <anthony.perard@gmail.com>" # gpg: aka "Anthony PERARD <anthony.perard@citrix.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 5379 2F71 024C 600F 778A 7161 D8D5 7199 DF83 42C8 # Subkey fingerprint: F80C 0063 08E2 2CFD 8A92 E798 0CF5 572F D7FB 55AF * remotes/aperard/tags/pull-xen-20190114: (25 commits) xen-block: avoid repeated memory allocation xen-block: improve response latency xen-block: improve batching behaviour xen: Replace few mentions of xend by libxl Remove broken Xen PV domain builder xen: remove the legacy 'xen_disk' backend MAINTAINERS: add myself as a Xen maintainer xen: automatically create XenBlockDevice-s xen: add a mechanism to automatically create XenDevice-s... xen: add implementations of xen-block connect and disconnect functions... xen: purge 'blk' and 'ioreq' from function names in dataplane/xen-block.c xen: remove 'ioreq' struct/varable/field names from dataplane/xen-block.c xen: remove 'XenBlkDev' and 'blkdev' names from dataplane/xen-block xen: add header and build dataplane/xen-block.c xen: remove unnecessary code from dataplane/xen-block.c xen: duplicate xen_disk.c as basis of dataplane/xen-block.c xen: add event channel interface for XenDevice-s xen: add grant table interface for XenDevice-s xen: add xenstore watcher infrastructure xen: create xenstore areas for XenDevice-s ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-01-14xen-block: avoid repeated memory allocationTim Smith1-5/+9
The xen-block dataplane currently allocates memory to hold the data for each request as that request is used, and frees it afterwards. Because it requires page-aligned blocks, this interacts poorly with non-page- aligned allocations and balloons the heap. Instead, allocate the maximum possible buffer size required for the protocol, which is BLKIF_MAX_SEGMENTS_PER_REQUEST (currently 11) pages when the request structure is created, and keep that buffer until it is destroyed. Since the requests are re-used via a free list, this should actually improve memory usage. Signed-off-by: Tim Smith <tim.smith@citrix.com> Re-based and commit comment adjusted. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen-block: improve response latencyTim Smith1-38/+18
If the I/O ring is full, the guest cannot send any more requests until some responses are sent. Only sending all available responses just before checking for new work does not leave much time for the guest to supply new work, so this will cause stalls if the ring gets full. Also, not completing reads as soon as possible adds latency to the guest. To alleviate that, complete IO requests as soon as they come back. xen_block_send_response() already returns a value indicating whether a notify should be sent, which is all the batching we need. Signed-off-by: Tim Smith <tim.smith@citrix.com> Re-based and commit comment adjusted. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen-block: improve batching behaviourTim Smith1-0/+35
When I/O consists of many small requests, performance is improved by batching them together in a single io_submit() call. When there are relatively few requests, the extra overhead is not worth it. This introduces a check to start batching I/O requests via blk_io_plug()/ blk_io_unplug() in an amount proportional to the number which were already in flight at the time we started reading the ring. Signed-off-by: Tim Smith <tim.smith@citrix.com> Re-based and commit comment adjusted. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen: Replace few mentions of xend by libxlAnthony PERARD3-4/+4
xend have been replaced by libxenlight (libxl) for many Xen releases now. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
2019-01-14Remove broken Xen PV domain builderAnthony PERARD9-379/+0
It is broken since Xen 4.9 [1] and it will not build in Xen 4.12. Also, it is not built by default since QEMU 2.6. [1] https://lists.xenproject.org/archives/html/xen-devel/2018-09/msg00313.html Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
2019-01-14xen: remove the legacy 'xen_disk' backendPaul Durrant2-1012/+0
This backend has now been replaced by the 'xen-qdisk' XenDevice. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14MAINTAINERS: add myself as a Xen maintainerPaul Durrant1-0/+1
I have made many significant contributions to the Xen code in QEMU, particularly the recent patches introducing a new PV device framework. I intend to make further significant contributions, porting other PV back- ends to the new framework with the intent of eventually removing the legacy code. It therefore seems reasonable that I become a maintainer of the Xen code. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen: automatically create XenBlockDevice-sPaul Durrant4-1/+391
This patch adds create and destroy function for XenBlockDevice-s so that they can be created automatically when the Xen toolstack instantiates a new PV backend via xenstore. When the XenBlockDevice is created this way it is also necessary to create a 'drive' which matches the configuration that the Xen toolstack has written into xenstore. This is done by formulating the parameters necessary for each 'blockdev' layer of the drive and then using qmp_blockdev_add() to create the layers. Also, for compatibility with the legacy 'xen_disk' implementation, an iothread is automatically created for the new XenBlockDevice. This, like the driver layers, will be destroyed after the XenBlockDevice is unrealized. The legacy backend scan for 'qdisk' is removed by this patch, which makes the 'xen_disk' code is redundant. The code will be removed by a subsequent patch. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>