aboutsummaryrefslogtreecommitdiff
path: root/hw/mem
AgeCommit message (Collapse)AuthorFilesLines
2024-06-19hw/mem/memory-device: Remove legacy_align from memory_device_pre_plug()Philippe Mathieu-Daudé2-9/+5
'legacy_align' is always NULL, remove it, simplifying memory_device_pre_plug(). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240617071118.60464-16-philmd@linaro.org>
2024-06-19hw/mem/pc-dimm: Remove legacy_align argument from pc_dimm_pre_plug()Philippe Mathieu-Daudé1-4/+2
'legacy_align' is always NULL, remove it. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240617071118.60464-15-philmd@linaro.org>
2024-04-25hw/cxl/cxl-cdat: Make cxl_doe_cdat_init() return booleanZhao Liu1-2/+1
As error.h suggested, the best practice for callee is to return something to indicate success / failure. With returned boolean, there's no need to dereference @errp to check failure case. Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-ID: <20240418100433.1085447-4-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-04-18memory-device: move stubs out of stubs/Paolo Bonzini2-0/+23
Since the memory-device stubs are needed exactly when the Kconfig symbols are not needed, move them to hw/mem/. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20240408155330.522792-15-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-03-12hw/mem/cxl_type3: Fix missing ERRP_GUARD() in ct3_realize()Zhao Liu1-0/+1
As the comment in qapi/error, dereferencing @errp requires ERRP_GUARD(): * = Why, when and how to use ERRP_GUARD() = * * Without ERRP_GUARD(), use of the @errp parameter is restricted: * - It must not be dereferenced, because it may be null. ... * ERRP_GUARD() lifts these restrictions. * * To use ERRP_GUARD(), add it right at the beginning of the function. * @errp can then be used without worrying about the argument being * NULL or &error_fatal. * * Using it when it's not needed is safe, but please avoid cluttering * the source with useless code. But in ct3_realize(), @errp is dereferenced without ERRP_GUARD(): cxl_doe_cdat_init(cxl_cstate, errp); if (*errp) { goto err_free_special_ops; } Here we check *errp, because cxl_doe_cdat_init() returns void. And ct3_realize() - as a PCIDeviceClass.realize() method - doesn't get the NULL @errp parameter, it hasn't triggered the bug that dereferencing the NULL @errp. To follow the requirement of @errp, add missing ERRP_GUARD() in ct3_realize(). Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240223085653.1255438-4-zhao1.liu@linux.intel.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-03-09hw/mem/cxl_type3: Fix problem with g_steal_pointer()Thomas Huth1-12/+12
When setting GLIB_VERSION_MAX_ALLOWED to GLIB_VERSION_2_58 or higher, glib adds type safety checks to the g_steal_pointer() macro. This triggers errors in the ct3_build_cdat_entries_for_mr() function which uses the g_steal_pointer() for type-casting from one pointer type to the other (which also looks quite weird since the local pointers have all been declared with g_autofree though they are never freed here). Fix it by using a proper typecast instead. For making this possible, we have to remove the QEMU_PACKED attribute from some structs since GCC otherwise complains that the source and destination pointer might have different alignment restrictions. Removing the QEMU_PACKED should be fine here since the structs are already naturally aligned. Anyway, add some QEMU_BUILD_BUG_ON() statements to make sure that we've got the right sizes (without padding in the structs). Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-14hw/cxl: Standardize all references on CXL r3.1 and minor updatesJonathan Cameron1-3/+3
Previously not all references mentioned any spec version at all. Given r3.1 is the current specification available for evaluation at www.computeexpresslink.org update references to refer to that. Hopefully this won't become a never ending job. A few structure definitions have been updated to add new fields. Defaults of 0 and read only are valid choices for these new DVSEC registers so go with that for now. There are additional error codes and some of the 'questions' in the comments are resolved now. Update documentation reference to point to the CXL r3.1 specification with naming closer to what is on the cover. For cases where there are structure version numbers, add defines so they can be found next to the register definitions. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20240126121636.24611-6-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-02-14hw/mem/cxl_type3: Fix potential divide by zero reported by coverityJonathan Cameron1-2/+7
Fixes Coverity ID 1522368. Currently error_fatal is set if interleave_ways_dec() is going to return 0 but we should handle that zero return explicitly. Reported-by: Stefan Hajnoczi <stefanha@gmail.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20240126120132.24248-10-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-02-14hw/mem/cxl_type3: Drop handling of failure of g_malloc0() and g_malloc()Jonathan Cameron1-45/+7
As g_malloc0/g_malloc() will just exit QEMU on failure there is no point in checking for it failing. Reviewed-by: Fan Ni <fan.ni@samsung.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20240126120132.24248-3-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-02-06memory-device: reintroduce memory region size checkDavid Hildenbrand1-0/+14
We used to check that the memory region size is multiples of the overall requested address alignment for the device memory address. We removed that check, because there are cases (i.e., hv-balloon) where devices unconditionally request an address alignment that has a very large alignment (i.e., 32 GiB), but the actual memory device size might not be multiples of that alignment. However, this change: (a) allows for some practically impossible DIMM sizes, like "1GB+1 byte". (b) allows for DIMMs that partially cover hugetlb pages, previously reported in [1]. Both scenarios don't make any sense: we might even waste memory. So let's reintroduce that check, but only check that the memory region size is multiples of the memory region alignment (i.e., page size, huge page size), but not any additional memory device requirements communicated using md->get_min_alignment(). The following examples now fail again as expected: (a) 1M with 2M THP qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \ -object memory-backend-ram,id=mem1,size=1M \ -device pc-dimm,id=dimm1,memdev=mem1 -> backend memory size must be multiple of 0x200000 (b) 1G+1byte qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \ -object memory-backend-ram,id=mem1,size=1073741825B \ -device pc-dimm,id=dimm1,memdev=mem1 -> backend memory size must be multiple of 0x200000 (c) Unliagned hugetlb size (2M) qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \ -object memory-backend-file,id=mem1,mem-path=/dev/hugepages/tmp,size=511M \ -device pc-dimm,id=dimm1,memdev=mem1 backend memory size must be multiple of 0x200000 (d) Unliagned hugetlb size (1G) qemu-system-x86_64 -m 4g,maxmem=16g,slots=1 -S -nodefaults -nographic \ -object memory-backend-file,id=mem1,mem-path=/dev/hugepages1G/tmp,size=2047M \ -device pc-dimm,id=dimm1,memdev=mem1 -> backend memory size must be multiple of 0x40000000 Note that this fix depends on a hv-balloon change to communicate its additional alignment requirements using get_min_alignment() instead of through the memory region. [1] https://lkml.kernel.org/r/f77d641d500324525ac036fe1827b3070de75fc1.1701088320.git.mprivozn@redhat.com Message-ID: <20240117135554.787344-3-david@redhat.com> Reported-by: Zhenyu Zhang <zhenyzha@redhat.com> Reported-by: Michal Privoznik <mprivozn@redhat.com> Fixes: eb1b7c4bd413 ("memory-device: Drop size alignment check") Tested-by: Zhenyu Zhang <zhenyzha@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-12-31meson: remove CONFIG_ALLPaolo Bonzini1-1/+0
CONFIG_ALL is tricky to use and was ported over to Meson from the recursive processing of Makefile variables. Meson sourcesets however have all_sources() and all_dependencies() methods that remove the need for it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-11-15hw/mem/memory-device.c: spelling fix: ontainingMichael Tokarev1-1/+1
Fixes: 6c1b28e9e405 "memory-device: Support empty memory devices" Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-11-07Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu ↵Stefan Hajnoczi2-18/+50
into staging virtio,pc,pci: features, fixes virtio sound card support vhost-user: back-end state migration cxl: line length reduction enabling fabric management vhost-vdpa: shadow virtqueue hash calculation Support shadow virtqueue RSS Support tests: CPU topology related smbios test cases Fixes, cleanups all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmVKDDoPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpF08H/0Zts8uvkHbgiOEJw4JMHU6/VaCipfIYsp01 # GSfwYOyEsXJ7GIxKWaCiMnWXEm7tebNCPKf3DoUtcAojQj3vuF9XbWBKw/bfRn83 # nGO/iiwbYViSKxkwqUI+Up5YiN9o0M8gBFrY0kScPezbnYmo5u2bcADdEEq6gH68 # D0Ea8i+WmszL891ypvgCDBL2ObDk3qX3vA5Q6J2I+HKX2ofJM59BwaKwS5ghw+IG # BmbKXUZJNjUQfN9dQ7vJuiuqdknJ2xUzwW2Vn612ffarbOZB1DZ6ruWlrHty5TjX # 0w4IXEJPBgZYbX9oc6zvTQnbLDBJbDU89mnme0TcmNMKWmQKTtc= # =vEv+ # -----END PGP SIGNATURE----- # gpg: Signature made Tue 07 Nov 2023 18:06:50 HKT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (63 commits) acpi/tests/avocado/bits: enable console logging from bits VM acpi/tests/avocado/bits: enforce 32-bit SMBIOS entry point hw/cxl: Add tunneled command support to mailbox for switch cci. hw/cxl: Add dummy security state get hw/cxl/type3: Cleanup multiple CXL_TYPE3() calls in read/write functions hw/cxl/mbox: Add Get Background Operation Status Command hw/cxl: Add support for device sanitation hw/cxl/mbox: Wire up interrupts for background completion hw/cxl/mbox: Add support for background operations hw/cxl: Implement Physical Ports status retrieval hw/pci-bridge/cxl_downstream: Set default link width and link speed hw/cxl/mbox: Add Physical Switch Identify command. hw/cxl/mbox: Add Information and Status / Identify command hw/cxl: Add a switch mailbox CCI function hw/pci-bridge/cxl_upstream: Move defintion of device to header. hw/cxl/mbox: Generalize the CCI command processing hw/cxl/mbox: Pull the CCI definition out of the CXLDeviceState hw/cxl/mbox: Split mailbox command payload into separate input and output hw/cxl/mbox: Pull the payload out of struct cxl_cmd and make instances constant hw/cxl: Fix a QEMU_BUILD_BUG_ON() in switch statement scope issue. ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-11-07hw/cxl: Add tunneled command support to mailbox for switch cci.Jonathan Cameron1-0/+11
This implementation of tunneling makes the choice that our Type 3 device is a Logical Device (LD) of a Multi-Logical Device (MLD) that just happens to only have one LD for now. Tunneling is supported from a Switch Mailbox CCI (and shortly via MCTP over I2C connected to the switch MCTP CCI) via an outer level to the FM owned LD in the MLD Type 3 device. From there an inner tunnel may be used to access particular LDs. Protocol wise, the following is what happens in a real system but we don't emulate the transports - just the destinations and the payloads. ( Host -> Switch Mailbox CCI - in band FM-API mailbox command or Host -> Switch MCTP CCI - MCTP over I2C using the CXL FM-API MCTP Binding. ) then (if a tunnel command) Switch -> Type 3 FM Owned LD - MCTP over PCI VDM using the CXL FM-API binding (addressed by switch port) then (if unwrapped command also a tunnel command) Type 3 FM Owned LD to LD0 via internal transport (addressed by LD number) or (added shortly) Host to Type 3 FM Owned MCTP CCI - MCTP over I2C using the CXL FM-API MCTP Binding. then (if unwrapped comand is a tunnel comamnd) Type 3 FM Owned LD to LD0 via internal transport. (addressed by LD number) It is worth noting that the tunneling commands over PCI VDM presumably use the appropriate MCTP binding depending on opcode. This may be the CXL FMAPI binding or the CXL Memory Device Binding. Additional commands will need to be added to make this useful beyond testing the tunneling works. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20231023160806.13206-18-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-11-07hw/cxl/type3: Cleanup multiple CXL_TYPE3() calls in read/write functionsGregory Price1-4/+6
Call CXL_TYPE3 once at top of function to avoid multiple invocations. Signed-off-by: Gregory Price <gregory.price@memverge.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20231023160806.13206-16-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-11-07hw/cxl: Add support for device sanitationDavidlohr Bueso1-0/+10
Make use of the background operations through the sanitize command, per CXL 3.0 specs. Traditionally run times can be rather long, depending on the size of the media. Estimate times based on: https://pmem.io/documents/NVDIMM_DSM_Interface-V1.8.pdf Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20231023160806.13206-14-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-11-07hw/cxl/mbox: Pull the CCI definition out of the CXLDeviceStateJonathan Cameron1-2/+3
Enables having multiple CCIs per devices. Each CCI (mailbox) has it's own state and command list, so they can't share a single structure. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20231023160806.13206-4-Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-11-07hw/cxl: Line length reductionsJonathan Cameron2-14/+22
Michael Tsirkin observed that there were some unnecessarily long lines in the CXL code in a recent review. This patch is intended to rectify that where it does not hurt readability. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20231023140210.3089-5-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-11-06memory-device: Drop size alignment checkDavid Hildenbrand1-6/+0
There is no strong requirement that the size has to be multiples of the requested alignment, let's drop it. This is a preparation for hv-baloon. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
2023-11-03memory-device: Support empty memory devicesDavid Hildenbrand1-3/+40
Let's support empty memory devices -- memory devices that don't have a memory device region in the current configuration. hv-balloon with an optional memdev is the primary use case. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
2023-10-12memory-device,vhost: Support automatic decision on the number of memslotsDavid Hildenbrand1-3/+93
We want to support memory devices that can automatically decide how many memslots they will use. In the worst case, they have to use a single memslot. The target use cases are virtio-mem and the hyper-v balloon. Let's calculate a reasonable limit such a memory device may use, and instruct the device to make a decision based on that limit. Use a simple heuristic that considers: * A memslot soft-limit for all memory devices of 256; also, to not consume too many memslots -- which could harm performance. * Actually still free and unreserved memslots * The percentage of the remaining device memory region that memory device will occupy. Further, while we properly check before plugging a memory device whether there still is are free memslots, we have other memslot consumers (such as boot memory, PCI BARs) that don't perform any checks and might dynamically consume memslots without any prior reservation. So we might succeed in plugging a memory device, but once we dynamically map a PCI BAR we would be in trouble. Doing accounting / reservation / checks for all such users is problematic (e.g., sometimes we might temporarily split boot memory into two memslots, triggered by the BIOS). We use the historic magic memslot number of 509 as orientation to when supporting 256 memory devices -> memslots (leaving 253 for boot memory and other devices) has been proven to work reliable. We'll fallback to suggesting a single memslot if we don't have at least 509 total memslots. Plugging vhost devices with less than 509 memslots available while we have memory devices plugged that consume multiple memslots due to automatic decisions can be problematic. Most configurations might just fail due to "limit < used + reserved", however, it can also happen that these memory devices would suddenly consume memslots that would actually be required by other memslot consumers (boot, PCI BARs) later. Note that this has always been sketchy with vhost devices that support only a small number of memslots; but we don't want to make it any worse.So let's keep it simple and simply reject plugging such vhost devices in such a configuration. Eventually, all vhost devices that want to be fully compatible with such memory devices should support a decent number of memslots (>= 509). Message-ID: <20230926185738.277351-13-david@redhat.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12memory-device,vhost: Support memory devices that dynamically consume memslotsDavid Hildenbrand1-2/+27
We want to support memory devices that have a dynamically managed memory region container as device memory region. This device memory region maps multiple RAM memory subregions (e.g., aliases to the same RAM memory region), whereby these subregions can be (un)mapped on demand. Each RAM subregion will consume a memslot in KVM and vhost, resulting in such a new device consuming memslots dynamically, and initially usually 0. We already track the number of used vs. required memslots for all memslots. From that, we can derive the number of reserved memslots that must not be used otherwise. The target use case is virtio-mem and the hyper-v balloon, which will dynamically map aliases to RAM memory region into their device memory region container. Properly document what's supported and what's not and extend the vhost memslot check accordingly. Message-ID: <20230926185738.277351-10-david@redhat.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12memory-device: Track required and actually used memslots in DeviceMemoryStateDavid Hildenbrand1-0/+54
Let's track how many memslots are required by plugged memory devices and how many are currently actually getting used by plugged memory devices. "required - used" is the number of reserved memslots. For now, the number of used and required memslots is always equal, and there are no reservations. This is a preparation for memory devices that want to dynamically consume memslots after initially specifying how many they require -- where we'll end up with reserved memslots. To track the number of used memslots, create a new address space for our device memory and register a memory listener (add/remove) for that address space. Message-ID: <20230926185738.277351-9-david@redhat.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12memory-device: Support memory devices with multiple memslotsDavid Hildenbrand1-8/+19
We want to support memory devices that have a memory region container as device memory region that maps multiple RAM memory regions. Let's start by supporting memory devices that statically map multiple RAM memory regions and, thereby, consume multiple memslots. We already have one device that uses a container as device memory region: NVDIMMs. However, a NVDIMM always ends up consuming exactly one memslot. Let's add support for that by asking the memory device via a new callback how many memslots it requires. Message-ID: <20230926185738.277351-7-david@redhat.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12vhost: Return number of free memslotsDavid Hildenbrand1-1/+1
Let's return the number of free slots instead of only checking if there is a free slot. Required to support memory devices that consume multiple memslots. This is a preparation for memory devices that consume multiple memslots. Message-ID: <20230926185738.277351-6-david@redhat.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12kvm: Return number of free memslotsDavid Hildenbrand1-1/+1
Let's return the number of free slots instead of only checking if there is a free slot. While at it, check all address spaces, which will also consider SMM under x86 correctly. This is a preparation for memory devices that consume multiple memslots. Message-ID: <20230926185738.277351-5-david@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-04hw/cxl: Support 4 HDM decoders at all levels of topologyJonathan Cameron1-31/+65
Support these decoders in CXL host bridges (pxb-cxl), CXL Switch USP and CXL Type 3 end points. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230913132523.29780-5-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04hw/cxl: Fix and use same calculation for HDM decoder block size everywhereJonathan Cameron1-9/+15
In order to avoid having the size of the per HDM decoder register block repeated in lots of places, create the register definitions for HDM decoder 1 and use the offset between the first registers in HDM decoder 0 and HDM decoder 1 to establish the offset. Calculate in each function as this is more obvious and leads to shorter line lengths than a single #define which would need a long name to be specific enough. Note that the code currently only supports one decoder, so the bugs this fixes don't actually affect anything. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230913132523.29780-4-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-21hw/mem/cxl_type3: Add missing copyright and license noticeJonathan Cameron2-0/+21
This has been missing from the start. Assume it should match with cxl/cxl-component-utils.c as both were part of early postings from Ben. Reported-by: Philippe Mathieu-Daudé <philmd@linaro.org> Acked-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-21hw/other: spelling fixesMichael Tokarev1-3/+3
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2023-09-19nvdimm: Reject writing label data to ROM instead of crashing QEMUDavid Hildenbrand1-3/+7
Currently, when using a true R/O NVDIMM (ROM memory backend) with a label area, the VM can easily crash QEMU by trying to write to the label area, because the ROM memory is mmap'ed without PROT_WRITE. [root@vm-0 ~]# ndctl disable-region region0 disabled 1 region [root@vm-0 ~]# ndctl zero-labels nmem0 -> QEMU segfaults Let's remember whether we have a ROM memory backend and properly reject the write request: [root@vm-0 ~]# ndctl disable-region region0 disabled 1 region [root@vm-0 ~]# ndctl zero-labels nmem0 zeroed 0 nmem In comparison, on a system with a R/W NVDIMM: [root@vm-0 ~]# ndctl disable-region region0 disabled 1 region [root@vm-0 ~]# ndctl zero-labels nmem0 zeroed 1 nmem For ACPI, just return "unsupported", like if no label exists. For spapr, return "H_P2", similar to when no label area exists. Could we rely on the "unarmed" property? Maybe, but it looks cleaner to only disallow what certainly cannot work. After all "unarmed=on" primarily means: cannot accept persistent writes. In theory, there might be setups where devices with "unarmed=on" set could be used to host non-persistent data (temporary files, system RAM, ...); for example, in Linux, admins can overwrite the "readonly" setting and still write to the device -- which will work as long as we're not using ROM. Allowing writing label data in such configurations can make sense. Message-ID: <20230906120503.359863-2-david@redhat.com> Fixes: dbd730e85987 ("nvdimm: check -object memory-backend-file, readonly=on option") Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-07-12memory-device: Track used region size in DeviceMemoryStateDavid Hildenbrand1-19/+3
Let's avoid iterating over all devices and simply track it in the DeviceMemoryState. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20230623124553.400585-11-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-07-12memory-device: Refactor memory_device_pre_plug()David Hildenbrand1-14/+14
Let's move memory_device_check_addable() and basic checks out of memory_device_get_free_addr() directly into memory_device_pre_plug(). Separating basic checks from address assignment is cleaner and prepares for further changes. As all memory device users now use memory_devices_init(), and that function enforces that the size is 0, we can drop the check for an empty region. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20230623124553.400585-10-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-07-12memory-device: Introduce machine_memory_devices_init()David Hildenbrand1-0/+14
Let's intrduce a new helper that we will use to replace existing memory device setup code during machine initialization. We'll enforce that the size has to be > 0. Once all machines were converted, we'll only allocate ms->device_memory if the size > 0. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20230623124553.400585-3-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-07-12memory-device: Unify enabled vs. supported error messagesDavid Hildenbrand1-9/+4
Let's unify the error messages, such that we can simply stop allocating ms->device_memory if the size would be 0 (and there are no memory devices ever). The case of "not supported by the machine" should barely pop up either way: if the machine doesn't support memory devices, it usually doesn't call the pre_plug handler ... Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20230623124553.400585-2-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-06-23hw/cxl/events: Add injection of Memory Module EventsJonathan Cameron2-0/+74
These events include a copy of the device health information at the time of the event. Actually using the emulated device health would require a lot of controls to manipulate that state. Given the aim of this injection code is to just test the flows when events occur, inject the contents of the device health state as well. Future work may add more sophisticate device health emulation including direct generation of these records when events occur (such as a temperature threshold being crossed). That does not reduce the usefulness of this more basic generation of the events. Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230530133603.16934-8-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-22hw/cxl/events: Add injection of DRAM eventsJonathan Cameron2-0/+129
Defined in CXL r3.0 8.2.9.2.1.2 DRAM Event Record, this event provides information related to DRAM devices. Example injection command in QMP: { "execute": "cxl-inject-dram-event", "arguments": { "path": "/machine/peripheral/cxl-mem0", "log": "informational", "flags": 1, "dpa": 1000, "descriptor": 3, "type": 3, "transaction-type": 192, "channel": 3, "rank": 17, "nibble-mask": 37421234, "bank-group": 7, "bank": 11, "row": 2, "column": 77, "correction-mask": [33, 44, 55,66] }} Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230530133603.16934-7-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-22hw/cxl/events: Add injection of General Media EventsIra Weiny2-0/+121
To facilitate testing provide a QMP command to inject a general media event. The event can be added to the log specified. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230530133603.16934-6-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-22hw/cxl/events: Add event interrupt supportIra Weiny1-2/+2
Replace the stubbed out CXL Get/Set Event interrupt policy mailbox commands. Enable those commands to control interrupts for each of the event log types. Skip the standard input mailbox length on the Set command due to DCD being optional. Perform the checks separately. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230530133603.16934-5-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-22hw/cxl/events: Wire up get/clear event mailbox commandsIra Weiny1-0/+1
CXL testing is benefited from an artificial event log injection mechanism. Add an event log infrastructure to insert, get, and clear events from the various logs available on a device. Replace the stubbed out CXL Get/Clear Event mailbox commands with commands that operate on the new infrastructure. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230530133603.16934-4-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-22hw/cxl: Add clear poison mailbox command support.Jonathan Cameron1-0/+37
Current implementation is very simple so many of the corner cases do not exist (e.g. fragmenting larger poison list entries) Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230526170010.574-5-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-22hw/cxl: QMP based poison injection supportJonathan Cameron2-0/+62
Inject poison using QMP command cxl-inject-poison to add an entry to the poison list. For now, the poison is not returned CXL.mem reads, but only via the mailbox command Get Poison List. So a normal memory read to an address that is on the poison list will not yet result in a synchronous exception (and similar for partial cacheline writes). That is left for a future patch. See CXL rev 3.0, sec 8.2.9.8.4.1 Get Poison list (Opcode 4300h) Kernel patches to use this interface here: https://lore.kernel.org/linux-cxl/cover.1665606782.git.alison.schofield@intel.com/ To inject poison using QMP (telnet to the QMP port) { "execute": "qmp_capabilities" } { "execute": "cxl-inject-poison", "arguments": { "path": "/machine/peripheral/cxl-pmem0", "start": 2048, "length": 256 } } Adjusted to select a device on your machine. Note that the poison list supported is kept short enough to avoid the complexity of state machine that is needed to handle the MORE flag. Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230526170010.574-3-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-20meson: Replace softmmu_ss -> system_ssPhilippe Mathieu-Daudé1-4/+4
We use the user_ss[] array to hold the user emulation sources, and the softmmu_ss[] array to hold the system emulation ones. Hold the latter in the 'system_ss[]' array for parity with user emulation. Mechanical change doing: $ sed -i -e s/softmmu_ss/system_ss/g $(git grep -l softmmu_ss) Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230613133347.82210-10-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-19hw/cxl: Multi-Region CXL Type-3 Devices (Volatile and Persistent)Gregory Price1-73/+223
This commit enables each CXL Type-3 device to contain one volatile memory region and one persistent region. Two new properties have been added to cxl-type3 device initialization: [volatile-memdev] and [persistent-memdev] The existing [memdev] property has been deprecated and will default the memory region to a persistent memory region (although a user may assign the region to a ram or file backed region). It cannot be used in combination with the new [persistent-memdev] property. Partitioning volatile memory from persistent memory is not yet supported. Volatile memory is mapped at DPA(0x0), while Persistent memory is mapped at DPA(vmem->size), per CXL Spec 8.2.9.8.2.0 - Get Partition Info. Signed-off-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Fan Ni <fan.ni@samsung.com> Tested-by: Fan Ni <fan.ni@samsung.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230421160827.2227-4-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-05-19hw/mem: Use memory_region_size() in cxl_type3Jonathan Cameron1-4/+4
Accessors prefered over direct use of int128_get64() as they clamp out of range values. None are expected here but cleaner to always use the accessor than mix and match. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230421160827.2227-3-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Gregory Price <gregory.price@memverge.com>
2023-05-19hw/cxl: Fix incorrect reset of commit and associated clearing of committed.Jonathan Cameron1-1/+20
The hardware clearing the commit bit is not spec compliant. Clearing of committed bit when commit is cleared is not specifically stated in the CXL spec, but is the expected (and simplest) permitted behaviour so use that for QEMU emulation. Reviewed-by: Fan Ni <fan.ni@samsung.com> Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> -- v2: Picked up tags. Message-Id: <20230421135906.3515-4-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-05-19hw/cxl: Fix endian handling for decoder commit.Jonathan Cameron1-3/+6
Not a real problem yet as all supported architectures are little endian, but continue to tidy these up when touching code for other reasons. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230421135906.3515-3-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-05-19hw/cxl: cdat: Fix failure to free buffer in erorr pathsJonathan Cameron1-0/+4
The failure paths in CDAT file loading did not clear up properly. Change to using g_auto_free and a local pointer for the buffer to ensure this function has no side effects on error. Also drop some unnecessary checks that can not fail. Cleanup properly after a failure to load a CDAT file. Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230421132020.7408-3-Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21virtio-balloon: optimize the virtio-balloon on the ARM platformYangming1-0/+7
Optimize the virtio-balloon feature on the ARM platform by adding a variable to keep track of the current hot-plugged pc-dimm size, instead of traversing the virtual machine's memory modules to count the current RAM size during the balloon inflation or deflation process. This variable can be updated only when plugging or unplugging the device, which will result in an increase of approximately 60% efficiency of balloon process on the ARM platform. We tested the total amount of time required for the balloon inflation process on ARM: inflate the balloon to 64GB of a 128GB guest under stress. Before: 102 seconds After: 42 seconds Signed-off-by: Qi Xi <xiqi2@huawei.com> Signed-off-by: Ming Yang yangming73@huawei.com Message-Id: <e13bc78f96774bfab4576814c293aa52@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Hildenbrand <david@redhat.com>
2023-03-22*: Add missing includes of qemu/error-report.hRichard Henderson2-0/+2
This had been pulled in via qemu/plugin.h from hw/core/cpu.h, but that will be removed. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230310195252.210956-5-richard.henderson@linaro.org> [AJB: add various additional cases shown by CI] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230315174331.2959-15-alex.bennee@linaro.org> Reviewed-by: Emilio Cota <cota@braap.org>