aboutsummaryrefslogtreecommitdiff
path: root/hw/nvme
AgeCommit message (Collapse)AuthorFilesLines
8 daystreewide: fix paths for relocated files in commentsSean Wei1-1/+1
After the docs directory restructuring, several comments refer to paths that no longer exist. Replace these references to the current file locations so readers can find the correct files. Related commits --------------- 189c099f75f (Jul 2021) docs: collect the disparate device emulation docs into one section Rename docs/system/{ => devices}/nvme.rst 5f4c96b779f (Feb 2023) docs/system/loongarch: update loongson3.rst and rename it to virt.rst Rename docs/system/loongarch/{loongson3.rst => virt.rst} fe0007f3c1d (Sep 2023) exec: Rename cpu.c -> cpu-target.c Rename cpus-common.c => cpu-common.c 42fa9665e59 (Apr 2025) exec: Restrict 'cpu_ldst.h' to accel/tcg/ Rename include/{exec/cpu_ldst.h => accel/tcg/cpu-ldst.h} Signed-off-by: Sean Wei <me@sean.taipei> Message-ID: <20250616.qemu.relocated.06@sean.taipei> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-06-12hw/nvme/ctrl: skip automatic zero-init of large arraysDaniel P. Berrangé1-3/+3
The 'nvme_map_sgl' method has a 256 element array used for copying data from the device. Skip the automatic zero-init of this array to eliminate the performance overhead in the I/O hot path. The 'segment' array will be fully initialized when reading data from the device. The 'nme_changed_nslist' method has a 4k byte array that is manually initialized with memset(). The compiler ought to be intelligent enough to turn the memset() into a static initialization operation, and thus not duplicate the automatic zero-init. Replacing memset() with '{}' makes it unambiguous that the array is statically initialized. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Message-id: 20250610123709.835102-24-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-05-15hw/nvme: fix nvme hotpluggingKlaus Jensen1-1/+0
Commit cd59f50ab017 caused a regression on nvme hotplugging for devices with an implicit nvm subsystem. The nvme-subsys device was incorrectly left with being marked as non-hotpluggable. Fix this. Cc: qemu-stable@nongnu.org Reported-by: Stéphane Graber <stgraber@stgraber.org> Tested-by: Stéphane Graber <stgraber@stgraber.org> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2950 Fixes: cd59f50ab017 ("hw/nvme: always initialize a subsystem") Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-04-25qom: Make InterfaceInfo[] uses constPhilippe Mathieu-Daudé1-1/+1
Mechanical change using: $ sed -i -E 's/\(InterfaceInfo.?\[/\(const InterfaceInfo\[/g' \ $(git grep -lE '\(InterfaceInfo.?\[\]\)') Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20250424194905.82506-7-philmd@linaro.org>
2025-04-25qom: Have class_init() take a const data argumentPhilippe Mathieu-Daudé3-3/+3
Mechanical change using gsed, then style manually adapted to pass checkpatch.pl script. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20250424194905.82506-4-philmd@linaro.org>
2025-04-08hw/nvme: fix attachment of private namespacesKlaus Jensen4-9/+14
Fix regression when attaching private namespaces that gets attached to the wrong controller. Keep track of the original controller "owner" of private namespaces, and only attach if this matches on controller enablement. Fixes: 6ccca4b6bb9f ("hw/nvme: rework csi handling") Reported-by: Alan Adamson <alan.adamson@oracle.com> Suggested-by: Alan Adamson <alan.adamson@oracle.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Alan Adamson <alan.adamson@oracle.com> Reviewed-by: Alan Adamson <alan.adamson@oracle.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Message-ID: <20250408-fix-private-ns-v1-1-28e169b6b60b@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2025-03-07Merge tag 'pull-vfio-20250306' of https://github.com/legoater/qemu into stagingStefan Hajnoczi1-2/+1
vfio queue: * Added property documentation * Added Minor fixes * Implemented basic PCI PM capability backing * Promoted new IGD maintainer * Deprecated vfio-plaform * Extended VFIO migration with multifd support # -----BEGIN PGP SIGNATURE----- # # iQIyBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmfJrZoACgkQUaNDx8/7 # 7KFE2A/0Dmief9u/dDJIKGIDa+iawcf4hu8iX4v5pB0DlGniT3rgK8WMGnhDpPxq # Q4wsKfo+JJ2q6msInrT7Ckqyydu9nQztI3vwmfMuWxLhTMyH28K96ptwPqIZBjOx # rPTEXfnVX4W3tpn1+48S+vefWVa/gkBkIvv7RpK18rMBXv1kDeyOvc/d2dbAt7ft # zJc4f8gH3jfQzGwmnYVZU1yPrZN7p6zhYR/AD3RQOY97swgZIEyYxXhOuTPiCuEC # zC+2AMKi9nmnCG6x/mnk7l2yJXSlv7lJdqcjYZhJ9EOIYfiUGTREYIgQbARcafE/ # 4KSg2QR35BoUd4YrmEWxXJCRf3XnyWXDY36dDKVhC0OHng1F/U44HuL4QxwoTIay # s1SP/DHcvDiPAewVTvdgt7Iwfn9xGhcQO2pkrxBoNLB5JYwW+R6mG7WXeDv1o3GT # QosTu1fXZezQqFd4v6+q5iRNS2KtBZLTspwAmVdywEFUs+ZLBRlC+bodYlinZw6B # Yl/z0LfAEh4J55QmX2espbp8MH1+mALuW2H2tgSGSrTBX1nwxZFI5veFzPepgF2S # eTx69BMjiNMwzIjq1T7e9NpDCceiW0fXDu7IK1MzYhqg1nM9lX9AidhFTeiF2DB2 # EPb3ljy/8fyxcPKa1T9X47hQaSjbMwofaO8Snoh0q0jokY246Q== # =hIBw # -----END PGP SIGNATURE----- # gpg: Signature made Thu 06 Mar 2025 22:13:46 HKT # gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1 # gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [full] # gpg: aka "Cédric Le Goater <clg@kaod.org>" [full] # Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1 * tag 'pull-vfio-20250306' of https://github.com/legoater/qemu: (42 commits) hw/core/machine: Add compat for x-migration-multifd-transfer VFIO property vfio/migration: Make x-migration-multifd-transfer VFIO property mutable vfio/migration: Add x-migration-multifd-transfer VFIO property vfio/migration: Multifd device state transfer support - send side vfio/migration: Multifd device state transfer support - config loading support migration/qemu-file: Define g_autoptr() cleanup function for QEMUFile vfio/migration: Multifd device state transfer support - load thread vfio/migration: Multifd device state transfer support - received buffers queuing vfio/migration: Setup and cleanup multifd transfer in these general methods vfio/migration: Multifd setup/cleanup functions and associated VFIOMultifd vfio/migration: Multifd device state transfer - add support checking function vfio/migration: Multifd device state transfer support - basic types vfio/migration: Move migration channel flags to vfio-common.h header file vfio/migration: Add vfio_add_bytes_transferred() vfio/migration: Convert bytes_transferred counter to atomic vfio/migration: Add load_device_config_state_start trace event migration: Add save_live_complete_precopy_thread handler migration/multifd: Add multifd_device_state_supported() migration/multifd: Make MultiFDSendData a struct migration/multifd: Device state transfer support - send side ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-03-06qdev: Rename PropertyInfo member @name to @typeMarkus Armbruster1-1/+1
PropertyInfo member @name becomes ObjectProperty member @type, while Property member @name becomes ObjectProperty member @name. Rename the former. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20250227085601.4140852-4-armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> [One missed instance of @type fixed]
2025-03-06pci: Use PCI PM capability initializerAlex Williamson1-2/+1
Switch callers directly initializing the PCI PM capability with pci_add_capability() to use pci_pm_init(). Cc: Dmitry Fleytman <dmitry.fleytman@gmail.com> Cc: Akihiko Odaki <akihiko.odaki@daynix.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Stefan Weil <sw@weilnetz.de> Cc: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Klaus Jensen <its@irrelevant.dk> Cc: Jesper Devantier <foss@defmacro.it> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Cédric Le Goater <clg@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-3-alex.williamson@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-02-26hw/nvme: remove nvme_aio_err()Klaus Jensen1-37/+23
nvme_rw_complete_cb() is the only remaining user of nvme_aio_err(), so open code the status code setting instead. Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-26hw/nvme: set error status code explicitly for misc commandsKlaus Jensen1-6/+22
The nvme_aio_err() does not handle Verify, Compare, Copy and other misc commands and defaults to setting the error status code to Internal Device Error. For some of these commands, we know better, so set it explicitly. For the commands using the nvme_misc_cb() callback (Copy, Flush, ...), if no status code has explicitly been set by the lower handlers, default to Internal Device Error as previously. Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-26hw/nvme: only set command abort requested when cancelled due to AbortKlaus Jensen1-4/+2
The Command Abort Requested status code should only be set if the command was explicitly cancelled due to an Abort command. Or, in the case the cancel was due to Submission Queue deletion, set the status code to Command Aborted due to SQ Deletion. Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-26hw/nvme: rework csi handlingKlaus Jensen3-107/+125
The controller incorrectly allows a zoned namespace to be attached even if CS.CSS is configured to only support the NVM command set for I/O queues. Rework handling of namespace command sets in general by attaching supported namespaces when the controller is started instead of, like now, statically when realized. Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-25hw/nvme: be compliant wrt. dsm processing limitsKlaus Jensen1-9/+15
The specification states that, > The controller shall set all three processing limit fields (i.e., the > DMRL, DMRSL and DMSL fields) to non-zero values or shall clear all > three processing limit fields to 0h. So, set the DMRL and DMSL fields in addition to DMRSL. Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-25nvme: fix iocs status code valuesKlaus Jensen1-2/+2
The status codes related to I/O Command Sets are in the wrong group. Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-25hw/nvme: add knob for doorbell buffer config supportKlaus Jensen2-3/+9
Add a 'dbcs' knob to allow Doorbell Buffer Config command to be disabled. Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-25hw/nvme: make oacs dynamicKlaus Jensen2-7/+22
Virtualization Management needs sriov-related parameters. Only report support for the command when that conditions are true. Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-25hw/nvme: always initialize a subsystemKlaus Jensen2-58/+42
If no nvme-subsys is explicitly configured, instantiate one. Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-25hw/nvme: Add OCP SMART / Health Information Extended Log PageStephen Bates2-0/+60
The Open Compute Project [1] includes a Datacenter NVMe SSD Specification [2]. The most recent version of this specification (as of November 2024) is 2.6.1. This specification layers on top of the NVM Express specifications [3] to provide additional functionality. A key part of of this is the 512 Byte OCP SMART / Health Information Extended log page that is defined in Section 4.8.6 of the specification. We add a controller argument (ocp) that toggles on/off the SMART log extended structure. To accommodate different vendor specific specifications like OCP, we add a multiplexing function (nvme_vendor_specific_log) which will route to the different log functions based on arguments and log ids. We only return the OCP extended SMART log when the command is 0xC0 and ocp has been turned on in the nvme argumants. Though we add the whole nvme SMART log extended structure, we only populate the physical_media_units_{read,written}, log_page_version and log_page_uuid. This patch is based on work done by Joel but has been modified enough that he requested a co-developed-by tag rather than a signed-off-by. [1]: https://www.opencompute.org/ [2]: https://www.opencompute.org/documents/datacenter-nvme-ssd-specification-v2-6-1-pdf [3]: https://nvmexpress.org/specifications/ Signed-off-by: Stephen Bates <sbates@raithlin.com> Co-developed-by: Joel Granados <j.granados@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-20pcie_sriov: Ensure VF addr does not overflowAkihiko Odaki1-8/+14
pci_new() aborts when creating a VF with addr >= PCI_DEVFN_MAX. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20250116-reuse-v20-7-7cb370606368@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-12-21Merge tag 'exec-20241220' of https://github.com/philmd/qemu into stagingStefan Hajnoczi3-7/+7
Accel & Exec patch queue - Ignore writes to CNTP_CTL_EL0 on HVF ARM (Alexander) - Add '-d invalid_mem' logging option (Zoltan) - Create QOM containers explicitly (Peter) - Rename sysemu/ -> system/ (Philippe) - Re-orderning of include/exec/ headers (Philippe) Move a lot of declarations from these legacy mixed bag headers: . "exec/cpu-all.h" . "exec/cpu-common.h" . "exec/cpu-defs.h" . "exec/exec-all.h" . "exec/translate-all" to these more specific ones: . "exec/page-protection.h" . "exec/translation-block.h" . "user/cpu_loop.h" . "user/guest-host.h" . "user/page-protection.h" # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmdlnyAACgkQ4+MsLN6t # wN6mBw//QFWi7CrU+bb8KMM53kOU9C507tjn99LLGFb5or73/umDsw6eo/b8DHBt # KIwGLgATel42oojKfNKavtAzLK5rOrywpboPDpa3SNeF1onW+99NGJ52LQUqIX6K # A6bS0fPdGG9ZzEuPpbjDXlp++0yhDcdSgZsS42fEsT7Dyj5gzJYlqpqhiXGqpsn8 # 4Y0UMxSL21K3HEexlzw2hsoOBFA3tUm2ujNDhNkt8QASr85yQVLCypABJnuoe/// # 5Ojl5wTBeDwhANET0rhwHK8eIYaNboiM9fHopJYhvyw1bz6yAu9jQwzF/MrL3s/r # xa4OBHBy5mq2hQV9Shcl3UfCQdk/vDaYaWpgzJGX8stgMGYfnfej1SIl8haJIfcl # VMX8/jEFdYbjhO4AeGRYcBzWjEJymkDJZoiSWp2NuEDi6jqIW+7yW1q0Rnlg9lay # ShAqLK5Pv4zUw3t0Jy3qv9KSW8sbs6PQxtzXjk8p97rTf76BJ2pF8sv1tVzmsidP # 9L92Hv5O34IqzBu2oATOUZYJk89YGmTIUSLkpT7asJZpBLwNM2qLp5jO00WVU0Sd # +kAn324guYPkko/TVnjC/AY7CMu55EOtD9NU35k3mUAnxXT9oDUeL4NlYtfgrJx6 # x1Nzr2FkS68+wlPAFKNSSU5lTjsjNaFM0bIJ4LCNtenJVP+SnRo= # =cjz8 # -----END PGP SIGNATURE----- # gpg: Signature made Fri 20 Dec 2024 11:45:20 EST # gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE # gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE * tag 'exec-20241220' of https://github.com/philmd/qemu: (59 commits) util/qemu-timer: fix indentation meson: Do not define CONFIG_DEVICES on user emulation system/accel-ops: Remove unnecessary 'exec/cpu-common.h' header system/numa: Remove unnecessary 'exec/cpu-common.h' header hw/xen: Remove unnecessary 'exec/cpu-common.h' header target/mips: Drop left-over comment about Jazz machine target/mips: Remove tswap() calls in semihosting uhi_fstat_cb() target/xtensa: Remove tswap() calls in semihosting simcall() helper accel/tcg: Un-inline translator_is_same_page() accel/tcg: Include missing 'exec/translation-block.h' header accel/tcg: Move tcg_cflags_has/set() to 'exec/translation-block.h' accel/tcg: Restrict curr_cflags() declaration to 'internal-common.h' qemu/coroutine: Include missing 'qemu/atomic.h' header exec/translation-block: Include missing 'qemu/atomic.h' header accel/tcg: Declare cpu_loop_exit_requested() in 'exec/cpu-common.h' exec/cpu-all: Include 'cpu.h' earlier so MMU_USER_IDX is always defined target/sparc: Move sparc_restore_state_to_opc() to cpu.c target/sparc: Uninline cpu_get_tb_cpu_state() target/loongarch: Declare loongarch_cpu_dump_state() locally user: Move various declarations out of 'exec/exec-all.h' ... Conflicts: hw/char/riscv_htif.c hw/intc/riscv_aplic.c target/s390x/cpu.c Apply sysemu header path changes to not in the pull request. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-12-20include: Rename sysemu/ -> system/Philippe Mathieu-Daudé3-7/+7
Headers in include/sysemu/ are not only related to system *emulation*, they are also used by virtualization. Rename as system/ which is clearer. Files renamed manually then mechanical change using sed tool. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Message-Id: <20241203172445.28576-1-philmd@linaro.org>
2024-12-19Constify all opaque Property pointersRichard Henderson1-2/+2
Via sed "s/ Property [*]/ const Property */". The opaque pointers passed to ObjectProperty callbacks are the last instances of non-const Property pointers in the tree. For the most part, these callbacks only use object_field_prop_ptr, which now takes a const pointer itself. This logically should have accompanied d36f165d952 which allowed const Property to be registered. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Link: https://lore.kernel.org/r/20241218134251.4724-25-richard.henderson@linaro.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-19include/hw/qdev-properties: Remove DEFINE_PROP_END_OF_LISTRichard Henderson3-3/+0
Now that all of the Property arrays are counted, we can remove the terminator object from each array. Update the assertions in device_class_set_props to match. With struct Property being 88 bytes, this was a rather large form of terminator. Saves 30k from qemu-system-aarch64. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Link: https://lore.kernel.org/r/20241218134251.4724-21-richard.henderson@linaro.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-15hw/nvme: Constify all PropertyRichard Henderson3-3/+3
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-03hw/nvme: take a reference on the subsystem on vf realizationKlaus Jensen1-0/+7
Make sure we grab a reference on the subsystem when a VF is realized. Otherwise, the subsytem will be unrealized automatically when the VFs are unregistered and unreffed. This fixes a latent bug but was not exposed until commit 08f632848008 ("pcie: Release references of virtual functions"). This was then fixed (or rather, hidden) by commit c613ad25125b ("pcie_sriov: Do not manually unrealize"), but that was then reverted (due to other issues) in commit b0fdaee5d1ed, exposing the bug yet again. Cc: qemu-stable@nongnu.org Fixes: 08f632848008 ("pcie: Release references of virtual functions") Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-12-03hw/nvme: SR-IOV VFs must hardwire pci interrupt pin register to zeroKlaus Jensen1-1/+7
The PCI Interrupt Pin Register does not apply to VFs and MUST be hardwired to zero. Fixes: 44c2c09488db ("hw/nvme: Add support for SR-IOV") Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-12-03hw/nvme: fix use/unuse of msix vectorsKlaus Jensen1-2/+3
Only call msix_{un,}use_vector() when interrupts are actually enabled for a completion queue. Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-12-03hw/nvme: fix msix_uninit with exclusive barKlaus Jensen1-1/+6
Commit fa905f65c554 introduced a machine compatibility parameter to enable an exclusive bar for msix. It failed to account for this when cleaning up. Make sure that if an exclusive bar is enabled, we use the proper cleanup routine. Cc: qemu-stable@nongnu.org Fixes: fa905f65c554 ("hw/nvme: add machine compatibility parameter to enable msix exclusive bar") Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-11-08hw/nvme: fix handling of over-committed queuesKlaus Jensen1-9/+12
If a host chooses to use the SQHD "hint" in the CQE to know if there is room in the submission queue for additional commands, it may result in a situation where there are not enough internal resources (struct NvmeRequest) available to process the command. For a lack of a better term, the host may "over-commit" the device (i.e., it may have more inflight commands than the queue size). For example, assume a queue with N entries. The host submits N commands and all are picked up for processing, advancing the head and emptying the queue. Regardless of which of these N commands complete first, the SQHD field of that CQE will indicate to the host that the queue is empty, which allows the host to issue N commands again. However, if the device has not posted CQEs for all the previous commands yet, the device will have less than N resources available to process the commands, so queue processing is suspended. And here lies an 11 year latent bug. In the absense of any additional tail updates on the submission queue, we never schedule the processing bottom-half again unless we observe a head update on an associated full completion queue. This has been sufficient to handle N-to-1 SQ/CQ setups (in the absense of over-commit of course). Incidentially, that "kick all associated SQs" mechanism can now be killed since we now just schedule queue processing when we return a processing resource to a non-empty submission queue, which happens to cover both edge cases. However, we must retain kicking the CQ if it was previously full. So, apparently, no previous driver tested with hw/nvme has ever used SQHD (e.g., neither the Linux NVMe driver or SPDK uses it). But then OSv shows up with the driver that actually does. I salute you. Fixes: f3c507adcd7b ("NVMe: Initial commit for new storage interface") Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2388 Reported-by: Waldemar Kozaczuk <jwkozaczuk@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-11-04hw/nvme: remove dead codeArun Kumar1-5/+0
Remove dead code which always returns success, since PRCHK will have a value of zero. Signed-off-by: Arun Kumar <arun.kka@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Link: https://lore.kernel.org/r/20241022222105.3609223-1-arun.kka@samsung.com Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-11-04hw/nvme: add NPDAL/NPDGLAyush Mishra1-1/+4
Add the NPDGL and NPDAL fields to support large alignment and granularities. Signed-off-by: Ayush Mishra <ayush.m55@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Link: https://lore.kernel.org/r/20241001012833.3551820-1-ayush.m55@samsung.com [k.jensen: renamed the enum values] Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-11-04hw/nvme: i/o cmd set independent namespace data structureArun Kumar4-1/+38
Add support for the I/O Command Set Independent Namespace Data Structure (CNS 8h and 1fh). Signed-off-by: Arun Kumar <arun.kka@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Link: https://lore.kernel.org/r/20240925004407.3521406-1-arun.kka@samsung.com Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-10-01hw/nvme: add atomic write supportAlan Adamson2-1/+169
Adds support for the controller atomic parameters: AWUN and AWUPF. Atomic Compare and Write Unit (ACWU) is not currently supported. Writes that adhere to the ACWU and AWUPF parameters are guaranteed to be atomic. New NVMe QEMU Parameters (See NVMe Specification for details): atomic.dn (default off) - Set the value of Disable Normal. atomic.awun=UINT16 (default: 0) atomic.awupf=UINT16 (default: 0) By default (Disable Normal set to zero), the maximum atomic write size is set to the AWUN value. If Disable Normal is set, the maximum atomic write size is set to AWUPF. Signed-off-by: Alan Adamson <alan.adamson@oracle.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-10-01hw/nvme: add knob for CTRATT.MEMKlaus Jensen2-1/+10
Add a boolean prop (ctratt.mem) for setting CTRATT.MEM and default it to unset (false) to keep existing behavior of the device intact. Reviewed-by: Minwoo Im <minwoo.im@samsung.com> Reviewed-by: Arun Kumar <arun.kka@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-09-30hw/nvme: support CTRATT.MEMArun Kumar1-5/+14
Indicate that 'MDTS and Size Limits Exclude Metadata (MEM)' in the Controller Attributes (CTRATT) I/O Command Set Independent Identify Controller Data Structure. Signed-off-by: Arun Kumar <arun.kka@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> [k.jensen: updated commit message] Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-09-30hw/nvme: clear masked events from the aer queueArun Kumar1-2/+9
Clear masked events from the aer queue when get log page is issued with RAE 0 without checking for the presence of outstanding aer requests. Signed-off-by: Arun Kumar <arun.kka@samsung.com> [k.jensen: remove unnecessary QTAILQ_EMPTY check] Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-09-30hw/nvme: report id controller metadata sgl supportKeith Busch1-1/+2
The controller already supports this decoding, so just set the ID_CTRL.SGLS field accordingly. Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-09-24hw/nvme: replace assert(false) with g_assert_not_reached()Pierrick Bouvier1-4/+4
This patch is part of a series that moves towards a consistent use of g_assert_not_reached() rather than an ad hoc mix of different assertion mechanisms. Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Message-ID: <20240919044641.386068-11-pierrick.bouvier@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-13hw: Use device_class_set_legacy_reset() instead of opencodingPeter Maydell1-1/+1
Use device_class_set_legacy_reset() instead of opencoding an assignment to DeviceClass::reset. This change was produced with: spatch --macro-file scripts/cocci-macro-file.h \ --sp-file scripts/coccinelle/device-reset.cocci \ --keep-comments --smpl-spacing --in-place --dir hw Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240830145812.1967042-8-peter.maydell@linaro.org
2024-08-20hw/nvme: fix leak of uninitialized memory in io_mgmt_recvKlaus Jensen1-1/+1
Yutaro Shimizu from the Cyber Defense Institute discovered a bug in the NVMe emulation that leaks contents of an uninitialized heap buffer if subsystem and FDP emulation are enabled. Cc: qemu-stable@nongnu.org Reported-by: Yutaro Shimizu <shimizu@cyberdefense.jp> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-08-01Revert "pcie_sriov: Ensure VF function number does not overflow"Michael S. Tsirkin1-16/+8
This reverts commit 77718701157f6ca77ea7a57b536fa0a22f676082. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-24Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu ↵Richard Henderson1-0/+62
into staging virtio,pci,pc: features,fixes pci: Initial support for SPDM Responders cxl: Add support for scan media, feature commands, device patrol scrub control, DDR5 ECS control, firmware updates virtio: in-order support virtio-net: support for SR-IOV emulation (note: known issues on s390, might get reverted if not fixed) smbios: memory device size is now configurable per Machine cpu: architecture agnostic code to support vCPU Hotplug Fixes, cleanups all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmae9l8PHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRp8fYH/impBH9nViO/WK48io4mLSkl0EUL8Y/xrMvH # zKFCKaXq8D96VTt1Z4EGKYgwG0voBKZaCEKYU/0ARGnSlSwxINQ8ROCnBWMfn2sx # yQt08EXVMznNLtXjc6U5zCoCi6SaV85GH40No3MUFXBQt29ZSlFqO/fuHGZHYBwS # wuVKvTjjNF4EsGt3rS4Qsv6BwZWMM+dE6yXpKWk68kR8IGp+6QGxkMbWt9uEX2Md # VuemKVnFYw0XGCGy5K+ZkvoA2DGpEw0QxVSOMs8CI55Oc9SkTKz5fUSzXXGo1if+ # M1CTjOPJu6pMym6gy6XpFa8/QioDA/jE2vBQvfJ64TwhJDV159s= # =k8e9 # -----END PGP SIGNATURE----- # gpg: Signature made Tue 23 Jul 2024 10:16:31 AM AEST # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined] # gpg: WARNING: The key's User ID is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (61 commits) hw/nvme: Add SPDM over DOE support backends: Initial support for SPDM socket support hw/pci: Add all Data Object Types defined in PCIe r6.0 tests/acpi: Add expected ACPI AML files for RISC-V tests/qtest/bios-tables-test.c: Enable basic testing for RISC-V tests/acpi: Add empty ACPI data files for RISC-V tests/qtest/bios-tables-test.c: Remove the fall back path tests/acpi: update expected DSDT blob for aarch64 and microvm acpi/gpex: Create PCI link devices outside PCI root bridge tests/acpi: Allow DSDT acpi table changes for aarch64 hw/riscv/virt-acpi-build.c: Update the HID of RISC-V UART hw/riscv/virt-acpi-build.c: Add namespace devices for PLIC and APLIC virtio-iommu: Add trace point on virtio_iommu_detach_endpoint_from_domain hw/vfio/common: Add vfio_listener_region_del_iommu trace event virtio-iommu: Remove the end point on detach virtio-iommu: Free [host_]resv_ranges on unset_iommu_devices virtio-iommu: Remove probe_done Revert "virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged" gdbstub: Add helper function to unregister GDB register space physmem: Add helper function to destroy CPU AddressSpace ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-22hw/nvme: Add SPDM over DOE supportWilfred Mallawa1-0/+62
Setup Data Object Exchange (DOE) as an extended capability for the NVME controller and connect SPDM to it (CMA) to it. Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Klaus Jensen <k.jensen@samsung.com> Message-Id: <20240703092027.644758-4-alistair.francis@wdc.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22hw/nvme: remove useless type castYao Xingtao1-1/+1
The type of req->cmd is NvmeCmd, cast the pointer of this type to NvmeCmd* is useless. Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-07-22hw/nvme: actually implement abortAyush Mishra1-0/+32
Abort was not implemented previously, but we can implement it for AERs and asynchrnously for I/O. Signed-off-by: Ayush Mishra <ayush.m55@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-07-22hw/nvme: add cross namespace copy supportArun Kumar1-92/+263
Extend copy command to copy user data across different namespaces via support for specifying a namespace for each source range Signed-off-by: Arun Kumar <arun.kka@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-07-22hw/nvme: fix memory leak in nvme_dsmZheyu Ma1-0/+1
The allocated memory to hold LBA ranges leaks in the nvme_dsm function. This happens because the allocated memory for iocb->range is not freed in all error handling paths. Fix this by adding a free to ensure that the allocated memory is properly freed. ASAN log: ==3075137==ERROR: LeakSanitizer: detected memory leaks Direct leak of 480 byte(s) in 6 object(s) allocated from: #0 0x55f1f8a0eddd in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3 #1 0x7f531e0f6738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738) #2 0x55f1faf1f091 in blk_aio_get block/block-backend.c:2583:12 #3 0x55f1f945c74b in nvme_dsm hw/nvme/ctrl.c:2609:30 #4 0x55f1f945831b in nvme_io_cmd hw/nvme/ctrl.c:4470:16 #5 0x55f1f94561b7 in nvme_process_sq hw/nvme/ctrl.c:7039:29 Cc: qemu-stable@nongnu.org Fixes: d7d1474fd85d ("hw/nvme: reimplement dsm to allow cancellation") Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-07-11hw/nvme: Expand VI/VQ resource to uint32Minwoo Im2-6/+6
VI and VQ resources cover queue resources in each VFs in SR-IOV. Current maximum I/O queue pair size is 0xffff, we can expand them to cover the full number of I/O queue pairs. This patch also fixed Identify Secondary Controller List overflow due to expand of number of secondary controllers. Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Minwoo Im <minwoo.im@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-07-11hw/nvme: Allocate sec-ctrl-list as a dynamic arrayMinwoo Im3-10/+5
To prevent further bumping up the number of maximum VF te support, this patch allocates a dynamic array (NvmeCtrl *)->sec_ctrl_list based on number of VF supported by sriov_max_vfs property. Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Minwoo Im <minwoo.im@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>