aboutsummaryrefslogtreecommitdiff
path: root/hw
AgeCommit message (Collapse)AuthorFilesLines
2022-06-23hw/acpi: Make the PCI hot-plug aware of SR-IOVŁukasz Gieryk1-1/+5
PCI device capable of SR-IOV support is a new, still-experimental feature with only a single working example of the Nvme device. This patch in an attempt to fix a double-free problem when a SR-IOV-capable Nvme device is hot-unplugged in the following scenario: Qemu CLI: --------- -device pcie-root-port,slot=0,id=rp0 -device nvme-subsys,id=subsys0 -device nvme,id=nvme0,bus=rp0,serial=deadbeef,subsys=subsys0,sriov_max_vfs=1,sriov_vq_flexible=2,sriov_vi_flexible=1 Guest OS: --------- sudo nvme virt-mgmt /dev/nvme0 -c 0 -r 1 -a 1 -n 0 sudo nvme virt-mgmt /dev/nvme0 -c 0 -r 0 -a 1 -n 0 echo 1 > /sys/bus/pci/devices/0000:01:00.0/reset sleep 1 echo 1 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs nvme virt-mgmt /dev/nvme0 -c 1 -r 1 -a 8 -n 1 nvme virt-mgmt /dev/nvme0 -c 1 -r 0 -a 8 -n 2 nvme virt-mgmt /dev/nvme0 -c 1 -r 0 -a 9 -n 0 sleep 2 echo 01:00.1 > /sys/bus/pci/drivers/nvme/bind Qemu monitor: ------------- device_del nvme0 Explanation of the problem and the proposed solution: 1) The current SR-IOV implementation assumes it’s the PhysicalFunction that creates and deletes VirtualFunctions. 2) It’s a design decision (the Nvme device at least) for the VFs to be of the same class as PF. Effectively, they share the dc->hotpluggable value. 3) When a VF is created, it’s added as a child node to PF’s PCI bus slot. 4) Monitor/device_del triggers the ACPI mechanism. The implementation is not aware of SR/IOV and ejects PF’s PCI slot, directly unrealizing all hot-pluggable (!acpi_pcihp_pc_no_hotplug) children nodes. 5) VFs are unrealized directly, and it doesn’t work well with (1). SR/IOV structures are not updated, so when it’s PF’s turn to be unrealized, it works on stale pointers to already-deleted VFs. The proposed fix is to make the PCI ACPI code aware of SR/IOV. Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-23hw/nvme: Update the initalization place for the AER queueŁukasz Gieryk1-2/+1
This patch updates the initialization place for the AER queue, so it’s initialized once, at controller initialization, and not every time controller is enabled. While the original version works for a non-SR-IOV device, as it’s hard to interact with the controller if it’s not enabled, the multiple reinitialization is not necessarily correct. With the SR/IOV feature enabled a segfault can happen: a VF can have its controller disabled, while a namespace can still be attached to the controller through the parent PF. An event generated in such case ends up on an uninitialized queue. While it’s an interesting question whether a VF should support AER in the first place, I don’t think it must be answered today. Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-23hw/nvme: Add support for the Virtualization Management commandŁukasz Gieryk3-2/+278
With the new command one can: - assign flexible resources (queues, interrupts) to primary and secondary controllers, - toggle the online/offline state of given controller. Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-23hw/nvme: Initialize capability structures for primary/secondary controllersŁukasz Gieryk2-7/+138
With four new properties: - sriov_v{i,q}_flexible, - sriov_max_v{i,q}_per_vf, one can configure the number of available flexible resources, as well as the limits. The primary and secondary controller capability structures are initialized accordingly. Since the number of available queues (interrupts) now varies between VF/PF, BAR size calculation is also adjusted. Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-23hw/nvme: Calculate BAR attributes in a functionŁukasz Gieryk1-14/+31
An NVMe device with SR-IOV capability calculates the BAR size differently for PF and VF, so it makes sense to extract the common code to a separate function. Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-23hw/nvme: Remove reg_size variable and update BAR0 size calculationŁukasz Gieryk2-6/+5
The n->reg_size parameter unnecessarily splits the BAR0 size calculation in two phases; removed to simplify the code. With all the calculations done in one place, it seems the pow2ceil, applied originally to reg_size, is unnecessary. The rounding should happen as the last step, when BAR size includes Nvme registers, queue registers, and MSIX-related space. Finally, the size of the mmio memory region is extended to cover the 1st 4KiB padding (see the map below). Access to this range is handled as interaction with a non-existing queue and generates an error trace, so actually nothing changes, while the reg_size variable is no longer needed. -------------------- | BAR0 | -------------------- [Nvme Registers ] [Queues ] [power-of-2 padding] - removed in this patch [4KiB padding (1) ] [MSIX TABLE ] [4KiB padding (2) ] [MSIX PBA ] [power-of-2 padding] Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-23hw/nvme: Make max_ioqpairs and msix_qsize configurable in runtimeŁukasz Gieryk2-16/+38
The NVMe device defines two properties: max_ioqpairs, msix_qsize. Having them as constants is problematic for SR-IOV support. SR-IOV introduces virtual resources (queues, interrupts) that can be assigned to PF and its dependent VFs. Each device, following a reset, should work with the configured number of queues. A single constant is no longer sufficient to hold the whole state. This patch tries to solve the problem by introducing additional variables in NvmeCtrl’s state. The variables for, e.g., managing queues are therefore organized as: - n->params.max_ioqpairs – no changes, constant set by the user - n->(mutable_state) – (not a part of this patch) user-configurable, specifies number of queues available _after_ reset - n->conf_ioqpairs - (new) used in all the places instead of the ‘old’ n->params.max_ioqpairs; initialized in realize() and updated during reset() to reflect user’s changes to the mutable state Since the number of available i/o queues and interrupts can change in runtime, buffers for sq/cqs and the MSIX-related structures are allocated big enough to handle the limits, to completely avoid the complicated reallocation. A helper function (nvme_update_msixcap_ts) updates the corresponding capability register, to signal configuration changes. Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-23hw/nvme: Implement the Function Level ResetŁukasz Gieryk3-4/+54
This patch implements the Function Level Reset, a feature currently not implemented for the Nvme device, while listed as a mandatory ("shall") in the 1.4 spec. The implementation reuses FLR-related building blocks defined for the pci-bridge module, and follows the same logic: - FLR capability is advertised in the PCIE config, - custom pci_write_config callback detects a write to the trigger register and performs the PCI reset, - which, eventually, calls the custom dc->reset handler. Depending on reset type, parts of the state should (or should not) be cleared. To distinguish the type of reset, an additional parameter is passed to the reset function. This patch also enables advertisement of the Power Management PCI capability. The main reason behind it is to announce the no_soft_reset=1 bit, to signal SR-IOV support where each VF can be reset individually. The implementation purposedly ignores writes to the PMCS.PS register, as even such naïve behavior is enough to correctly handle the D3->D0 transition. It’s worth to note, that the power state transition back to to D3, with all the corresponding side effects, wasn't and stil isn't handled properly. Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-23hw/nvme: Add support for Secondary Controller ListLukasz Maniak5-10/+121
Introduce handling for Secondary Controller List (Identify command with CNS value of 15h). Secondary controller ids are unique in the subsystem, hence they are reserved by it upon initialization of the primary controller to the number of sriov_max_vfs. ID reservation requires the addition of an intermediate controller slot state, so the reserved controller has the address 0xFFFF. A secondary controller is in the reserved state when it has no virtual function assigned, but its primary controller is realized. Secondary controller reservations are released to NULL when its primary controller is unregistered. Signed-off-by: Lukasz Maniak <lukasz.maniak@linux.intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-23hw/nvme: Add support for Primary Controller CapabilitiesLukasz Maniak3-5/+21
Implementation of Primary Controller Capabilities data structure (Identify command with CNS value of 14h). Currently, the command returns only ID of a primary controller. Handling of remaining fields are added in subsequent patches implementing virtualization enhancements. Signed-off-by: Lukasz Maniak <lukasz.maniak@linux.intel.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-23hw/nvme: Add support for SR-IOVLukasz Maniak2-4/+84
This patch implements initial support for Single Root I/O Virtualization on an NVMe device. Essentially, it allows to define the maximum number of virtual functions supported by the NVMe controller via sriov_max_vfs parameter. Passing a non-zero value to sriov_max_vfs triggers reporting of SR-IOV capability by a physical controller and ARI capability by both the physical and virtual function devices. NVMe controllers created via virtual functions mirror functionally the physical controller, which may not entirely be the case, thus consideration would be needed on the way to limit the capabilities of the VF. NVMe subsystem is required for the use of SR-IOV. Signed-off-by: Lukasz Maniak <lukasz.maniak@linux.intel.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-06-22aspeed/hace: Add missing newlines to unimp messagesJoel Stanley1-2/+2
Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-06-22aspeed/i2c: Enable SLAVE_ADDR_RX_MATCH alwaysCédric Le Goater1-3/+10
There is no 'slave match interrupt' enable bit in the Interrupt Control Register. Consider it is always enabled and extend the mask value 'bus->regs[intr_ctrl_reg]' with the SLAVE_ADDR_RX_MATCH bit when the interrupt is raised. Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-06-22hw/i2c/aspeed: add DEV_ADDR in old register modeKlaus Jensen1-2/+2
Add support for writing and reading the device address register in old register mode. On the AST2400 (only 1 slave address) * no upper bits On the AST2500 (2 possible slave addresses), * bit[31] : Slave Address match indicator * bit[30] : Slave Address Receiving pending On the AST2600 (3 possible slave addresses), * bit[31-30] : Slave Address match indicator * bit[29] : Slave Address Receiving pending The model could be more precise to take into account all fields but since the Linux driver is masking the register value being set, it should be fine. See commit 3fb2e2aeafb2 ("i2c: aspeed: disable additional device addresses on ast2[56]xx") from Zeiv. This can be addressed later. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> [ clg: add details to commit log ] Message-Id: <20220601210831.67259-3-its@irrelevant.dk> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-06-22hw/i2c/aspeed: rework raise interrupt trace eventKlaus Jensen2-13/+23
Build a single string instead of having several parameters on the trace event. Suggested-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> [ clg: simplified trace buffer creation ] Message-Id: <20220601210831.67259-2-its@irrelevant.dk> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-06-22aspeed: Add I2C buses to AST1030 modelTroy Lee2-0/+31
Instantiate the I2C buses in AST1030 model and create two slave device for ast1030-evb. Signed-off-by: Troy Lee <troy_lee@aspeedtech.com> Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com> Signed-off-by: Steven Lee <steven_lee@aspeedtech.com> Reviewed-by: Joel Stanley <joel@jms.id.au> [ clg : - adapted to current AST1030 upstream models - changed AST2600 to AST1030 in comment - fixed typo in commit log ] Message-Id: <20220324100439.478317-3-troy_lee@aspeedtech.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-06-22aspeed/i2c: Add ast1030 controller modelsCédric Le Goater1-0/+24
Based on : https://lore.kernel.org/qemu-devel/20220324100439.478317-2-troy_lee@aspeedtech.com/ Cc: Troy Lee <troy_lee@aspeedtech.com> Cc: Jamin Lin <jamin_lin@aspeedtech.com> Cc: Steven Lee <steven_lee@aspeedtech.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-06-22aspeed: i2c: Move regs and helpers to header fileJoe Komlodi1-266/+0
Moves register definitions and short commonly used inlined functiosn to the header file to help tidy up the implementation file. Signed-off-by: Joe Komlodi <komlodi@google.com> Change-Id: I34dff7485b6bbe3c9482715ccd94dbd65dc5f324 Message-Id: <20220331043248.2237838-8-komlodi@google.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-06-22aspeed: i2c: Add PKT_DONE IRQ to traceJoe Komlodi2-1/+4
Signed-off-by: Joe Komlodi <komlodi@google.com> Change-Id: I566eb09f4b9016e24570572f367627f6594039f5 Message-Id: <20220331043248.2237838-7-komlodi@google.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-06-22aspeed: i2c: Add new mode supportJoe Komlodi1-194/+650
On AST2600, I2C has a secondary mode, called "new mode", which changes the layout of registers, adds some minor behavior changes, and introduces a new way to transfer data called "packet mode". Most of the bit positions of the fields are the same between old and new mode, so we use SHARED_FIELD_XX macros to reuse most of the code between the different modes. For packet mode, most of the command behavior is the same compared to other modes, but there are some minor changes to how interrupts are handled compared to other modes. Signed-off-by: Joe Komlodi <komlodi@google.com> Change-Id: I072f8301964f623afc74af1fe50c12e5caef199e Message-Id: <20220331043248.2237838-6-komlodi@google.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-06-22aspeed: i2c: Use reg array instead of individual varsJoe Komlodi1-155/+126
Using a register array will allow us to represent old-mode and new-mode I2C registers by using the same underlying register array, instead of adding an entire new set of variables to represent new mode. As part of this, we also do additional cleanup to use ARRAY_FIELD_ macros instead of FIELD_ macros on registers. Signed-off-by: Joe Komlodi <komlodi@google.com> Change-Id: Ib94996b17c361b8490c042b43c99d8abc69332e3 [ clg: use of memset in aspeed_i2c_bus_reset() ] Message-Id: <20220331043248.2237838-5-komlodi@google.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-06-22aspeed: i2c: Migrate to registerfields APIJoe Komlodi1-197/+196
This cleans up some of the field accessing, setting, and clearing bitwise operations, and wraps them in macros instead. Signed-off-by: Joe Komlodi <komlodi@google.com> Change-Id: I33018d6325fa04376e7c29dc4a49ab389a8e333a Message-Id: <20220331043248.2237838-4-komlodi@google.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-06-22aspeed: Remove fake RTC device on ast2500-evbCédric Le Goater1-4/+0
The board has no such device. It might have been useful for some tests in the past, it's not anymore and the same can be achieved on the command line. Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-06-20ppc/pnv: fix extra indent spaces with DEFINE_PROP*Daniel Henrique Barboza3-14/+14
The DEFINE_PROP* macros in pnv files are using extra spaces for no good reason. Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Message-Id: <20220602215351.149910-1-danielhb413@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-06-20pnv/xive2: Access direct mapped thread contexts from all chipsFrederic Barrat1-4/+14
When accessing a thread context through the IC BAR, the offset of the page in the BAR identifies the CPU. From that offset, we can compute the PIR (processor ID register) of the CPU to do the data structure lookup. On P10, the current code assumes an access for node 0 when computing the PIR. Everything is almost in place to allow access for other nodes though. So this patch reworks how the PIR value is computed so that we can access all thread contexts through the IC BAR. The PIR is already correct on P9, so no need to modify anything there. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20220602165310.558810-1-fbarrat@linux.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-06-20ppc: fix boot with sam460exMichael S. Tsirkin1-0/+8
Recent changes to pcie_host corrected size of its internal region to match what it expects: only the low 28 bits are ever decoded. Previous code just ignored bit 29 (if size was 1 << 29) in the address which does not make much sense. We are now asserting on size > 1 << 28 instead, but PPC 4xx actually allows guest to configure different sizes, and some firmwares seem to set it to 1 << 29. This caused e.g. qemu-system-ppc -M sam460ex to exit with an assert when the guest writes a value to CFGMSK register when trying to map config space. This is done in the board firmware in ppc4xx_init_pcie_port() in roms/u-boot-sam460ex/arch/powerpc/cpu/ppc4xx/4xx_pcie.c It's not clear what the proper fix should be but for now let's force the size to 256MB, so anything outside the expected address range is ignored. Fixes: commit 1f1a7b2269 ("include/hw/pci/pcie_host: Correct PCIE_MMCFG_SIZE_MAX") Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu> Tested-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20220526224229.95183-1-mst@redhat.com> [danielhb: changed commit msg as BALATON Zoltan suggested] Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-06-16Merge tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu ↵Richard Henderson10-79/+899
into staging virtio,pc,pci: fixes,cleanups,features more CXL patches RSA support for crypto fixes, cleanups all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmKrYLMPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpwpwH/2IS+V7wS3q/XXPz1HndJLpUP/z+mkeu9W6+ # X1U9CJ+66Ag4eD5T/jzoN0JEjiTeET/3xM+PY5NYZCh6QTAmA7EfFZv99oNWpGd1 # +nyxOdaMDPSscOKjLfDziVTi/QYIZBtU6TeixL9whkipYCqmgbs5gXV8ynltmKyF # bIJVeaXm5yQLcCTGzKzdXf+HmTErpEGDCDHFjzrLVjICRDdekElGVwYTn+ycl7p7 # oLsWWVDgqo0p86BITlrHUXUrxTXF3wyg2B59cT7Ilbb3o+Fa2GsP+o9IXMuVoNNp # A+zrq1QZ49UO3XwkS03xDDioUQ1T/V0L4w9dEfaGvpY4Horv0HI= # =PvmT # -----END PGP SIGNATURE----- # gpg: Signature made Thu 16 Jun 2022 09:56:19 AM PDT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu: acpi/erst: fix fallthrough code upon validation failure vhost: also check queue state in the vhost_dev_set_log error routine crypto: Introduce RSA algorithm virtio-iommu: Add an assert check in translate routine virtio-iommu: Use recursive lock to avoid deadlock virtio-iommu: Add bypass mode support to assigned device virtio/vhost-user: Fix wrong vhost notifier GPtrArray size docs/cxl: Add switch documentation pci-bridge/cxl_downstream: Add a CXL switch downstream port pci-bridge/cxl_upstream: Add a CXL switch upstream port Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-06-16acpi/erst: fix fallthrough code upon validation failureAni Sinha1-0/+3
At any step when any validation fail in check_erst_backend_storage(), there is no need to continue further through other validation checks. Further, by continuing even when record_size is 0, we run the risk of triggering a divide by zero error if we continued with other validation checks. Hence, we should simply return from this function upon validation failure. CC: Peter Maydell <peter.maydell@linaro.org> CC: Eric DeVolder <eric.devolder@oracle.com> Signed-off-by: Ani Sinha <ani@anisinha.ca> Message-Id: <20220513141005.1929422-1-ani@anisinha.ca> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eric DeVolder <eric.devolder@oracle.com>
2022-06-16vhost: also check queue state in the vhost_dev_set_log error routineNi Xun1-0/+4
When check queue state in the vhost_dev_set_log routine, it miss the error routine check, this patch also check queue state in error case. Fixes: 1e5a050f5798 ("check queue state in the vhost_dev_set_log routine") Signed-off-by: Ni Xun <richardni@tencent.com> Reviewed-by: Zhigang Lu <tonnylu@tencent.com> Message-Id: <OS0PR01MB57139163F3F3955960675B52EAA79@OS0PR01MB5713.jpnprd01.prod.outlook.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-16crypto: Introduce RSA algorithmzhenwei pi1-66/+257
There are two parts in this patch: 1, support akcipher service by cryptodev-builtin driver 2, virtio-crypto driver supports akcipher service In principle, we should separate this into two patches, to avoid compiling error, merge them into one. Then virtio-crypto gets request from guest side, and forwards the request to builtin driver to handle it. Test with a guest linux: 1, The self-test framework of crypto layer works fine in guest kernel 2, Test with Linux guest(with asym support), the following script test(note that pkey_XXX is supported only in a newer version of keyutils): - both public key & private key - create/close session - encrypt/decrypt/sign/verify basic driver operation - also test with kernel crypto layer(pkey add/query) All the cases work fine. Run script in guest: rm -rf *.der *.pem *.pfx modprobe pkcs8_key_parser # if CONFIG_PKCS8_PRIVATE_KEY_PARSER=m rm -rf /tmp/data dd if=/dev/random of=/tmp/data count=1 bs=20 openssl req -nodes -x509 -newkey rsa:2048 -keyout key.pem -out cert.pem -subj "/C=CN/ST=BJ/L=HD/O=qemu/OU=dev/CN=qemu/emailAddress=qemu@qemu.org" openssl pkcs8 -in key.pem -topk8 -nocrypt -outform DER -out key.der openssl x509 -in cert.pem -inform PEM -outform DER -out cert.der PRIV_KEY_ID=`cat key.der | keyctl padd asymmetric test_priv_key @s` echo "priv key id = "$PRIV_KEY_ID PUB_KEY_ID=`cat cert.der | keyctl padd asymmetric test_pub_key @s` echo "pub key id = "$PUB_KEY_ID keyctl pkey_query $PRIV_KEY_ID 0 keyctl pkey_query $PUB_KEY_ID 0 echo "Enc with priv key..." keyctl pkey_encrypt $PRIV_KEY_ID 0 /tmp/data enc=pkcs1 >/tmp/enc.priv echo "Dec with pub key..." keyctl pkey_decrypt $PRIV_KEY_ID 0 /tmp/enc.priv enc=pkcs1 >/tmp/dec cmp /tmp/data /tmp/dec echo "Sign with priv key..." keyctl pkey_sign $PRIV_KEY_ID 0 /tmp/data enc=pkcs1 hash=sha1 > /tmp/sig echo "Verify with pub key..." keyctl pkey_verify $PRIV_KEY_ID 0 /tmp/data /tmp/sig enc=pkcs1 hash=sha1 echo "Enc with pub key..." keyctl pkey_encrypt $PUB_KEY_ID 0 /tmp/data enc=pkcs1 >/tmp/enc.pub echo "Dec with priv key..." keyctl pkey_decrypt $PRIV_KEY_ID 0 /tmp/enc.pub enc=pkcs1 >/tmp/dec cmp /tmp/data /tmp/dec echo "Verify with pub key..." keyctl pkey_verify $PUB_KEY_ID 0 /tmp/data /tmp/sig enc=pkcs1 hash=sha1 Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: lei he <helei.sig11@bytedance.com Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220611064243.24535-2-pizhenwei@bytedance.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-16virtio-iommu: Add an assert check in translate routineZhenzhong Duan1-0/+4
With address space switch supported, dma access translation only happen after endpoint is attached to a non-bypass domain. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Message-Id: <20220613061010.2674054-4-zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-16virtio-iommu: Use recursive lock to avoid deadlockZhenzhong Duan1-9/+11
When switching address space with mutex lock hold, mapping will be replayed for assigned device. This will trigger relock deadlock. Also release the mutex resource in unrealize routine. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Message-Id: <20220613061010.2674054-3-zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-16virtio-iommu: Add bypass mode support to assigned deviceZhenzhong Duan2-2/+114
Currently assigned devices can not work in virtio-iommu bypass mode. Guest driver fails to probe the device due to DMA failure. And the reason is because of lacking GPA -> HPA mappings when VM is created. Add a root container memory region to hold both bypass memory region and iommu memory region, so the switch between them is supported just like the implementation in virtual VT-d. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Message-Id: <20220613061010.2674054-2-zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-16virtio/vhost-user: Fix wrong vhost notifier GPtrArray sizeYajun Wu1-1/+1
In fetch_or_create_notifier, idx begins with 0. So the GPtrArray size should be idx + 1 and g_ptr_array_set_size should be called with idx + 1. This wrong GPtrArray size causes fetch_or_create_notifier return an invalid address. Passing this invalid pointer to vhost_user_host_notifier_remove causes assert fail: qemu/include/qemu/int128.h:27: int128_get64: Assertion `r == a' failed. shutting down, reason=crashed Backends like dpdk-vdpa which sends out vhost notifier requests almost always hit qemu crash. Fixes: 503e355465 ("virtio/vhost-user: dynamically assign VhostUserHostNotifiers") Signed-off-by: Yajun Wu <yajunw@nvidia.com> Acked-by: Parav Pandit <parav@nvidia.com> Change-Id: I87e0f7591ca9a59d210879b260704a2d9e9d6bcd Message-Id: <20220526034851.683258-1-yajunw@nvidia.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2022-06-16pci-bridge/cxl_downstream: Add a CXL switch downstream portJonathan Cameron3-3/+291
Emulation of a simple CXL Switch downstream port. The Device ID has been allocated for this use. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20220616145126.8002-3-Jonathan.Cameron@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-16pci-bridge/cxl_upstream: Add a CXL switch upstream portJonathan Cameron2-1/+217
An initial simple upstream port emulation to allow the creation of CXL switches. The Device ID has been allocated for this use. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20220616145126.8002-2-Jonathan.Cameron@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-16Merge tag 'pull-9p-20220616' of https://github.com/cschoenebeck/qemu into ↵Richard Henderson1-23/+40
staging 9pfs: fix 'Twalk' protocol violation Actual fix is patch 5, whereas patch 4 being preparatory, all other patches are test cases to guard this Twalk issue. # -----BEGIN PGP SIGNATURE----- # # iQJLBAABCgA1FiEEltjREM96+AhPiFkBNMK1h2Wkc5UFAmKrDSAXHHFlbXVfb3Nz # QGNydWRlYnl0ZS5jb20ACgkQNMK1h2Wkc5VgoQ//bA/lXYa6hds4f73+opq7iiJ/ # 88gnJO8uPctNWXJ5f6ufXcTFtC99QRcl97jgSQhSIUdaZCfcpg7Pq3fONc060cMt # MNxi5Da31Fq7xz4UhSQHgWlgAfomfClYoBSOtrrxjVbXChA2rB7FXhD9aewimUtt # TlolXdJuPbGR3F6H0glN1itij12Ay5c0DMqFPy5npYlzjNhxmPb8QgFZ8E+lxhcT # hG+OMmS9O5Mk7WKYWC1Iij7tWm45RbThPEUsfCPt6jIJYQqheOQs0ohJG9wyCZu3 # JUCgSBPG1nNY0hgBJ/X7un7e89BoRw8edwqP+sSigfDf+LquUggqRFgz+joTbfvj # Prq+1NTDIckDRZF6CDUSkZE3+Gq3qlIhw/2vS+bjYZrk04lP4x8d9JYPSkT3i8xc # +YT/apDUkT68FjJ6PudfS2j6xRtYt86nOuWuhYukTZ2z5FJ0c9XAJlJX2ZS9Az3n # AqKFCT+8UE4VYKnAJ61xDdqvAdEmKJUi5YutfuwH+j6sS4peLX0gg8mGlNi7y8JK # bsqNjE1ve8rkp24DuUoHmivs/m1ogJi9Jxp5IjB4d26MPhgojrxOpaYUVg98QS7d # os2ES47CSn4KFxqsFMZnZpgzKxIvRQ4C9bBbSClDOffHWHRJub6PCw5F9eCTH4dO # z/QPJ+smDY7bolF+gSg= # =3ejn # -----END PGP SIGNATURE----- # gpg: Signature made Thu 16 Jun 2022 03:59:44 AM PDT # gpg: using RSA key 96D8D110CF7AF8084F88590134C2B58765A47395 # gpg: issuer "qemu_oss@crudebyte.com" # gpg: Good signature from "Christian Schoenebeck <qemu_oss@crudebyte.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: ECAB 1A45 4014 1413 BA38 4926 30DB 47C3 A012 D5F4 # Subkey fingerprint: 96D8 D110 CF7A F808 4F88 5901 34C2 B587 65A4 7395 * tag 'pull-9p-20220616' of https://github.com/cschoenebeck/qemu: tests/9pfs: check fid being unaffected in fs_walk_2nd_nonexistent tests/9pfs: guard recent 'Twalk' behaviour fix 9pfs: fix 'Twalk' to only send error if no component walked 9pfs: refactor 'name_idx' -> 'nwalked' in v9fs_walk() tests/9pfs: compare QIDs in fs_walk_none() test tests/9pfs: Twalk with nwname=0 tests/9pfs: walk to non-existent dir Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-06-16Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into stagingRichard Henderson5-19/+16
* statistics subsystem * virtio reset cleanups * build system cleanups * fix Cirrus CI # -----BEGIN PGP SIGNATURE----- # # iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmKpooQUHHBib256aW5p # QHJlZGhhdC5jb20ACgkQv/vSX3jHroNlFwf+OugLGRZl3KVc7akQwUJe9gg2T31h # VkC+7Tei8FAwe8vDppVd+CYEIi0M3acxD2amRrv2etCCGSuySN1PbkfRcSfPBX01 # pRWpasdhfqnZR8Iidi7YW1Ou5CcGqKH49nunBhW10+osb/mu5sVscMuOJgTDj/lK # CpsmDyk6572yGmczjNLlmhYcTU36clHpAZgazZHwk1PU+B3fCKlYYyvUpT3ItJvd # cK92aIUWrfofl3yTy0k4IwvZwNjTBirlstOIomZ333xzSA+mm5TR+mTvGRTZ69+a # v+snpMp4ILDMoB5kxQ42kK5WpdiN//LnriA9CBFDtOidsDDn8kx7gJe2RA== # =Dxwa # -----END PGP SIGNATURE----- # gpg: Signature made Wed 15 Jun 2022 02:12:36 AM PDT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (21 commits) build: include pc-bios/ part in the ROMS variable meson: put cross compiler info in a separate section q35:Enable TSEG only when G_SMRAME and TSEG_EN both enabled build: fix check for -fsanitize-coverage-allowlist tests/vm: allow running tests in an unconfigured source tree configure: cleanup -fno-pie detection configure: update list of preserved environment variables virtio-mmio: cleanup reset virtio: stop ioeventfd on reset virtio-mmio: stop ioeventfd on legacy reset s390x: simplify virtio_ccw_reset_virtio block: add more commands to preconfig mode hmp: add filtering of statistics by name qmp: add filtering of statistics by name hmp: add filtering of statistics by provider qmp: add filtering of statistics by provider hmp: add basic "info stats" implementation cutils: add functions for IEC and SI prefixes qmp: add filtering of statistics by target vCPU kvm: Support for querying fd-based stats ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-06-169pfs: fix 'Twalk' to only send error if no component walkedChristian Schoenebeck1-16/+33
Current implementation of 'Twalk' request handling always sends an 'Rerror' response if any error occured. The 9p2000 protocol spec says though: " If the first element cannot be walked for any reason, Rerror is returned. Otherwise, the walk will return an Rwalk message containing nwqid qids corresponding, in order, to the files that are visited by the nwqid successful elementwise walks; nwqid is therefore either nwname or the index of the first elementwise walk that failed. " http://ericvh.github.io/9p-rfc/rfc9p2000.html#anchor33 For that reason we are no longer leaving from an error path in function v9fs_walk(), unless really no path component could be walked successfully or if the request has been interrupted. Local variable 'nwalked' counts and reflects the number of path components successfully processed by background I/O thread, whereas local variable 'name_idx' subsequently counts and reflects the number of path components eventually accepted successfully by 9p server controller portion. New local variable 'any_err' is an aggregate variable reflecting whether any error occurred at all, while already existing variable 'err' only reflects the last error. Despite QIDs being delivered to client in a more relaxed way now, it is important to note though that fid still must remain unaffected if any error occurred. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <bc73e24258a75dc29458024c7936c8a036c3eac5.1647339025.git.qemu_oss@crudebyte.com>
2022-06-169pfs: refactor 'name_idx' -> 'nwalked' in v9fs_walk()Christian Schoenebeck1-8/+8
The local variable 'name_idx' is used in two loops in function v9fs_walk(). Let the first loop use its own variable 'nwalked' instead, which we will use in subsequent patch as the number of (requested) path components successfully walked by background I/O thread. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <d506308e7e343023c4db95d0e6053dd2627ed3c1.1647339025.git.qemu_oss@crudebyte.com>
2022-06-15vfio-user: handle reset of remote deviceJagannathan Raman1-0/+20
Adds handler to reset a remote device Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 112eeadf3bc4c6cdb100bc3f9a6fcfc20b467c1b.1655151679.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-06-15vfio-user: handle device interruptsJagannathan Raman6-11/+268
Forward remote device's interrupts to the guest Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Message-id: 9523479eaafe050677f4de2af5dd0df18c27cfd9.1655151679.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-06-15vfio-user: handle PCI BAR accessesJagannathan Raman2-0/+193
Determine the BARs used by the PCI device and register handlers to manage the access to the same. Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 3373e10b5be5f42846f0632d4382466e1698c505.1655151679.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-06-15vfio-user: handle DMA mappingsJagannathan Raman3-0/+62
Define and register callbacks to manage the RAM regions used for device DMA Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: faacbcd45c4d02c591f0dbfdc19041fbb3eae7eb.1655151679.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-06-15vfio-user: IOMMU support for remote deviceJagannathan Raman3-1/+144
Assign separate address space for each device in the remote processes. Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: afe0b0a97582cdad42b5b25636a29c523265a10a.1655151679.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-06-15vfio-user: handle PCI config space accessesJagannathan Raman2-0/+53
Define and register handlers for PCI config space accesses Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: be9d2ccf9b1d24e50dcd9c23404dbf284142cec7.1655151679.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-06-15vfio-user: run vfio-user contextJagannathan Raman1-1/+117
Setup a handler to run vfio-user context. The context is driven by messages to the file descriptor associated with it - get the fd for the context and hook up the handler with it Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: e934b0090529d448b6a7972b21dfc3d7421ce494.1655151679.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-06-15vfio-user: find and init PCI deviceJagannathan Raman1-0/+67
Find the PCI device with specified id. Initialize the device context with the QEMU PCI device Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 7798dbd730099b33fdd00c4c202cfe79e5c5c151.1655151679.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-06-15vfio-user: instantiate vfio-user contextJagannathan Raman1-0/+82
create a context with the vfio-user library to run a PCI device Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: a452871ac8c812ff96fc4f0ce6037f4769953fab.1655151679.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-06-15vfio-user: define vfio-user-server objectJagannathan Raman4-0/+241
Define vfio-user object which is remote process server for QEMU. Setup object initialization functions and properties necessary to instantiate the object Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: e45a17001e9b38f451543a664ababdf860e5f2f2.1655151679.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>