Age | Commit message (Collapse) | Author | Files | Lines |
|
During dirty page log sync, vfio_devices_all_dirty_tracking() is used to
check if dirty tracking has been started in order to avoid errors. The
current logic checks if migration is in ACTIVE or DEVICE states to
ensure dirty tracking has been started.
However, recently there has been an effort to simplify the migration
status API and reduce it to a single migration_is_running() function.
To accommodate this, refactor vfio_devices_all_dirty_tracking() logic so
it won't use migration_is_active() and migration_is_device(). Instead,
use internal VFIO dirty tracking flags.
As a side effect, now that migration status is no longer used to detect
dirty tracking status, VFIO log syncs are untied from migration. This
will make calc-dirty-rate more accurate as now it will also include VFIO
dirty pages.
While at it, as VFIODevice->dirty_tracking is now used to detect dirty
tracking status, add a comment that states how it's protected.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Tested-by: Joao Martins <joao.m.martins@oracle.com>
Link: https://lore.kernel.org/r/20241218134022.21264-3-avihaih@nvidia.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
Add a flag to VFIOContainerBase that indicates whether dirty tracking
has been started for the container or not.
This will be used in the following patches to allow dirty page syncs
only if dirty tracking has been started.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Tested-by: Joao Martins <joao.m.martins@oracle.com>
Link: https://lore.kernel.org/r/20241218134022.21264-2-avihaih@nvidia.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
DSM region is likely to store framebuffer in Windows, a small DSM
region may cause display issues (e.g. half of the screen is black).
Since 971ca22f041b ("vfio/igd: don't set stolen memory size to zero"),
the x-igd-gms option was functionally removed, QEMU uses host's
original value, which is determined by DVMT Pre-Allocated option in
Intel FSP of host bios.
However, some vendors do not expose this config item to users. In
such cases, x-igd-gms option can be used to manually set the data
stolen memory size for guest. So this commit brings this option back,
keeping its old behavior. When it is not specified, QEMU uses host's
value.
When DVMT Pre-Allocated option is available in host BIOS, user should
set DSM region size there instead of using x-igd-gms option.
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-11-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
A recent commit in i915 driver [1] claims the BDSM register at 0x1080c0
of mmio bar0 has been there since gen 6. Mirror this register to the 32
bit BDSM register at 0x5c in pci config space for gen6-10 devices.
[1] https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-7-ville.syrjala@linux.intel.com
Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-10-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
The GGC register at 0x50 of pci config space is a mirror of the same
register at 0x108040 of mmio bar0 [1]. i915 driver also reads that
register from mmio bar0 instead of config space. As GGC is programmed
and emulated by qemu, the mmio address should also be emulated, in the
same way of BDSM register.
[1] 4.1.28, 12th Generation Intel Core Processors Datasheet Volume 2
https://www.intel.com/content/www/us/en/content-details/655259
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-9-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
igd devices have multipe registers mirroring mmio address and pci
config space, more than a single BDSM register. To support this,
the read/write functions are made common and a macro is defined to
simplify the declaration of MemoryRegionOps.
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-8-tomitamoeko@gmail.com
[ clg : Fixed conversion specifier on 32-bit platform ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
All gen 11 and 12 igd devices have 64 bit BDSM register at 0xC0 in its
config space, add them to the list to support igd passthrough on Alder/
Raptor/Rocket/Ice/Jasper Lake platforms.
Tested legacy mode of igd passthrough works properly on both linux and
windows guests with AlderLake-S GT1 (8086:4680).
Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-7-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
Both Gemini Lake and Comet Lake are gen 9 devices. Many user reports
on internet shows legacy mode of igd passthrough works as qemu treats
them as gen 8 devices by default before e433f208973f ("vfio/igd:
return an invalid generation for unknown devices").
Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-6-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
Add helper functions igd_gtt_memory_size() and igd_stolen_size() for
calculating GTT stolen memory and Data stolen memory size in bytes,
and use macros to replace the hardware-related magic numbers for
better readability.
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-5-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
Define the igd device generations according to i915 kernel driver to
avoid confusion, and adjust comment placement to clearly reflect the
relationship between ids and devices.
The condition of how GTT stolen memory size is calculated is changed
accordingly as GGMS is in multiple of 2 starting from gen 8.
Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-4-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
Since e433f208973f ("vfio/igd: return an invalid generation for unknown
devices"), the default return of igd_gen() was changed to unsupported.
There is no need to filter out those unsupported devices.
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Link: https://lore.kernel.org/r/20241206122749.9893-3-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
On gen 8 and later devices, the GTT stolen memory size when GGMS equals
0 is 0 (no preallocated memory) rather than 1MB [1].
[1] 3.1.13, 5th Generation Intel Core Processor Family Datasheet Vol. 2
https://www.intel.com/content/www/us/en/content-details/330835
Fixes: c4c45e943e51 ("vfio/pci: Intel graphics legacy mode assignment")
Reported-By: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-2-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
staging
Hi,
"Host Memory Backends" and "Memory devices" queue ("mem"):
- Fixup handling of virtio-mem unplug during system resets, as
preparation for s390x support (especially kdump in the Linux guest)
- virtio-mem support for s390x
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmdnFD4RHGRhdmlkQHJl
# ZGhhdC5jb20ACgkQTd4Q9wD/g1rWBBAAp7WkYaNAjRy1PgpjNZ3z1gUJc/vk+skJ
# xVgGodA8txrJOFpNrbTyfhrdLs2TV4oWDvB/zrZRRtuxvur3O1EhFd9k6EqXuydr
# 0FunvLvVJwRHfEZycjN4aacQMRH3CJw07OaTzexeSl5UR/6w5PRofwUK4HX7W/Ka
# arqomGa3OJrs1+WgkV0Qcn4vh9HLRVv3iNC2Xo4W1wOCr1Du9zSPn9oC7zOQ0EO4
# ZC//7QsdkNRjUX/yMXMkhlSXx3b/RmRg2DBrxo7BZXg27VwGu4uHxL4LRBZiB2A7
# V9MqFOcVKzPMkXKTRjrgZ0vXQx1MPJ6WprEihMzMpYU6DrpA7KN/l8Ca8H24B2ln
# h7+bmkDsHVVcWovE9ii/9cMRfws6uWXXg3KoA8RQ8IbX1tU02lblw2uHhXEzcoge
# npqp/Z5LAiKVMetEnNnLH5thjut5PAEjuqD00cmZAMy4DNngLX2bGSdzMeVBkDMa
# 78ehLGRplm3t7ibUfaZaMKe6UD9tFrcD6XKsvUTXXHNbYO8ynbx58WOxSZmY98zU
# n3JNQRqtXYjBVlH3Dqm47vOTZHgOzFv3raa8BmSLpcBDeTXCTcUIl20s77dGw/vT
# r5YNCMN7O4YPFKUoRK9604QTgw6qlYaRTQlJD09usprGqVylb6gQtfZZuZkYDMp8
# sEI77QHsePA=
# =HDxr
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 21 Dec 2024 14:17:18 EST
# gpg: using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A
# gpg: issuer "david@redhat.com"
# gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown]
# gpg: aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full]
# gpg: aka "David Hildenbrand <hildenbr@in.tum.de>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D FCCA 4DDE 10F7 00FF 835A
* tag 'mem-2024-12-21' of https://github.com/davidhildenbrand/qemu:
s390x: virtio-mem support
s390x/virtio-ccw: add support for virtio based memory devices
s390x: remember the maximum page size
s390x/pv: prepare for memory devices
s390x/s390-virtio-ccw: prepare for memory devices
s390x/s390-skeys: prepare for memory devices
s390x/s390-stattrib-kvm: prepare for memory devices and sparse memory layouts
s390x/s390-hypercall: introduce DIAG500 STORAGE_LIMIT
s390x: introduce s390_get_memory_limit()
s390x/s390-virtio-ccw: move setting the maximum guest size from sclp to machine code
s390x: rename s390-virtio-hcall* to s390-hypercall*
s390x/s390-virtio-hcall: prepare for more diag500 hypercalls
s390x/s390-virtio-hcall: remove hypercall registration mechanism
s390x/s390-virtio-ccw: don't crash on weird RAM sizes
virtio-mem: unplug memory only during system resets, not device resets
Conflicts:
- hw/s390x/s390-stattrib-kvm.c
sysemu/ -> system/ header rename conflict.
- hw/s390x/virtio-ccw-mem.c
Make Property array const and removed DEFINE_PROP_END_OF_LIST() to
conform to the latest conventions.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Let's add our virtio-mem-ccw proxy device and wire it up. We should
be supporting everything (e.g., device unplug, "dynamic-memslots") that
we already support for the virtio-pci variant.
With a Linux guest that supports virtio-mem (and has automatic memory
onlining properly configured) the following example will work:
1. Start a VM with 4G initial memory and a virtio-mem device with a maximum
capacity of 16GB:
qemu/build/qemu-system-s390x \
--enable-kvm \
-m 4G,maxmem=20G \
-nographic \
-smp 8 \
-hda Fedora-Server-KVM-40-1.14.s390x.qcow2 \
-chardev socket,id=monitor,path=/var/tmp/monitor,server,nowait \
-mon chardev=monitor,mode=readline \
-object memory-backend-ram,id=mem0,size=16G,reserve=off \
-device virtio-mem-ccw,id=vmem0,memdev=mem0,dynamic-memslots=on
2. Query the current size of virtio-mem device:
(qemu) info memory-devices
Memory device [virtio-mem]: "vmem0"
memaddr: 0x100000000
node: 0
requested-size: 0
size: 0
max-size: 17179869184
block-size: 1048576
memdev: /objects/mem0
3. Request to grow it to 8GB (hotplug 8GB):
(qemu) qom-set vmem0 requested-size 8G
(qemu) info memory-devices
Memory device [virtio-mem]: "vmem0"
memaddr: 0x100000000
node: 0
requested-size: 8589934592
size: 8589934592
max-size: 17179869184
block-size: 1048576
memdev: /objects/mem0
4. Request to grow to 16GB (hotplug another 8GB):
(qemu) qom-set vmem0 requested-size 16G
(qemu) info memory-devices
Memory device [virtio-mem]: "vmem0"
memaddr: 0x100000000
node: 0
requested-size: 17179869184
size: 17179869184
max-size: 17179869184
block-size: 1048576
memdev: /objects/mem0
5. Try to hotunplug all memory again, shrinking to 0GB:
(qemu) qom-set vmem0 requested-size 0G
(qemu) info memory-devices
Memory device [virtio-mem]: "vmem0"
memaddr: 0x100000000
node: 0
requested-size: 0
size: 0
max-size: 17179869184
block-size: 1048576
memdev: /objects/mem0
6. If it worked, unplug the device
(qemu) device_del vmem0
(qemu) info memory-devices
(qemu) object_del mem0
7. Hotplug a new device with a smaller capacity and directly size it to 1GB
(qemu) object_add memory-backend-ram,id=mem0,size=8G,reserve=off
(qemu) device_add virtio-mem-ccw,id=vmem0,memdev=mem0,\
dynamic-memslots=on,requested-size=1G
(qemu) info memory-devices
Memory device [virtio-mem]: "vmem0"
memaddr: 0x100000000
node: 0
requested-size: 1073741824
size: 1073741824
max-size: 8589934592
block-size: 1048576
memdev: /objects/mem0
Trying to use a virtio-mem device backed by hugetlb into a !hugetlb VM
correctly results in the error:
... Memory device uses a bigger page size than initial memory
Note that the virtio-mem driver in Linux will supports 1 MiB (pageblock)
granularity.
Message-ID: <20241219144115.2820241-15-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
Accel & Exec patch queue
- Ignore writes to CNTP_CTL_EL0 on HVF ARM (Alexander)
- Add '-d invalid_mem' logging option (Zoltan)
- Create QOM containers explicitly (Peter)
- Rename sysemu/ -> system/ (Philippe)
- Re-orderning of include/exec/ headers (Philippe)
Move a lot of declarations from these legacy mixed bag headers:
. "exec/cpu-all.h"
. "exec/cpu-common.h"
. "exec/cpu-defs.h"
. "exec/exec-all.h"
. "exec/translate-all"
to these more specific ones:
. "exec/page-protection.h"
. "exec/translation-block.h"
. "user/cpu_loop.h"
. "user/guest-host.h"
. "user/page-protection.h"
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmdlnyAACgkQ4+MsLN6t
# wN6mBw//QFWi7CrU+bb8KMM53kOU9C507tjn99LLGFb5or73/umDsw6eo/b8DHBt
# KIwGLgATel42oojKfNKavtAzLK5rOrywpboPDpa3SNeF1onW+99NGJ52LQUqIX6K
# A6bS0fPdGG9ZzEuPpbjDXlp++0yhDcdSgZsS42fEsT7Dyj5gzJYlqpqhiXGqpsn8
# 4Y0UMxSL21K3HEexlzw2hsoOBFA3tUm2ujNDhNkt8QASr85yQVLCypABJnuoe///
# 5Ojl5wTBeDwhANET0rhwHK8eIYaNboiM9fHopJYhvyw1bz6yAu9jQwzF/MrL3s/r
# xa4OBHBy5mq2hQV9Shcl3UfCQdk/vDaYaWpgzJGX8stgMGYfnfej1SIl8haJIfcl
# VMX8/jEFdYbjhO4AeGRYcBzWjEJymkDJZoiSWp2NuEDi6jqIW+7yW1q0Rnlg9lay
# ShAqLK5Pv4zUw3t0Jy3qv9KSW8sbs6PQxtzXjk8p97rTf76BJ2pF8sv1tVzmsidP
# 9L92Hv5O34IqzBu2oATOUZYJk89YGmTIUSLkpT7asJZpBLwNM2qLp5jO00WVU0Sd
# +kAn324guYPkko/TVnjC/AY7CMu55EOtD9NU35k3mUAnxXT9oDUeL4NlYtfgrJx6
# x1Nzr2FkS68+wlPAFKNSSU5lTjsjNaFM0bIJ4LCNtenJVP+SnRo=
# =cjz8
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 20 Dec 2024 11:45:20 EST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* tag 'exec-20241220' of https://github.com/philmd/qemu: (59 commits)
util/qemu-timer: fix indentation
meson: Do not define CONFIG_DEVICES on user emulation
system/accel-ops: Remove unnecessary 'exec/cpu-common.h' header
system/numa: Remove unnecessary 'exec/cpu-common.h' header
hw/xen: Remove unnecessary 'exec/cpu-common.h' header
target/mips: Drop left-over comment about Jazz machine
target/mips: Remove tswap() calls in semihosting uhi_fstat_cb()
target/xtensa: Remove tswap() calls in semihosting simcall() helper
accel/tcg: Un-inline translator_is_same_page()
accel/tcg: Include missing 'exec/translation-block.h' header
accel/tcg: Move tcg_cflags_has/set() to 'exec/translation-block.h'
accel/tcg: Restrict curr_cflags() declaration to 'internal-common.h'
qemu/coroutine: Include missing 'qemu/atomic.h' header
exec/translation-block: Include missing 'qemu/atomic.h' header
accel/tcg: Declare cpu_loop_exit_requested() in 'exec/cpu-common.h'
exec/cpu-all: Include 'cpu.h' earlier so MMU_USER_IDX is always defined
target/sparc: Move sparc_restore_state_to_opc() to cpu.c
target/sparc: Uninline cpu_get_tb_cpu_state()
target/loongarch: Declare loongarch_cpu_dump_state() locally
user: Move various declarations out of 'exec/exec-all.h'
...
Conflicts:
hw/char/riscv_htif.c
hw/intc/riscv_aplic.c
target/s390x/cpu.c
Apply sysemu header path changes to not in the pull request.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Let's implement support for abstract virtio based memory devices, using
the virtio-pci implementation as an orientation. Wire them up in the
machine hotplug handler, taking care of s390x page size limitations.
As we neither support virtio-mem or virtio-pmem yet, the code is
effectively unused. We'll implement support for virtio-mem based on this
next.
Note that we won't wire up the virtio-pci variant (should currently be
impossible due to lack of support for MSI-X), but we'll add a safety net
to reject plugging them in the pre-plug handler.
Message-ID: <20241219144115.2820241-14-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
Let's remember the value (successfully) set via s390_set_max_pagesize().
This will be helpful to reject hotplugged memory devices that would exceed
this initially set page size.
Handle it just like how we handle s390_get_memory_limit(), storing it in
the machine, and moving the handling to machine code.
Message-ID: <20241219144115.2820241-13-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
Let's prepare our address space for memory devices if enabled via
"maxmem" and if we have CONFIG_MEM_DEVICE enabled at all. Note that
CONFIG_MEM_DEVICE will be selected automatically once we add support
for devices.
Just like on other architectures, the region container for memory devices
is placed directly above our initial memory. For now, we only align the
start address of the region up to 1 GiB, but we won't add any additional
space to the region for internal alignment purposes; this can be done in
the future if really required.
The RAM size returned via SCLP is not modified, as this only
covers initial RAM (and standby memory we don't implement) and not memory
devices; clarify that in the docs of read_SCP_info(). Existing OSes without
support for memory devices will keep working as is, even when memory
devices would be attached the VM.
Guest OSs which support memory devices, such as virtio-mem, will
consult diag500(), to find out the maximum possible pfn. Guest OSes that
don't support memory devices, don't have to be changed and will continue
relying on information provided by SCLP.
There are no remaining maxram_size users in s390x code, and the remaining
ram_size users only care about initial RAM:
* hw/s390x/ipl.c
* hw/s390x/s390-hypercall.c
* hw/s390x/sclp.c
* target/s390x/kvm/pv.c
Message-ID: <20241219144115.2820241-11-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
With memory devices, we will have storage keys for memory that
exceeds the initial ram size.
The TODO already states that current handling is subopimal,
but we won't worry about improving that (TCG-only) thing for now.
Message-ID: <20241219144115.2820241-10-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
With memory devices, we will have storage attributes for memory that
exceeds the initial ram size. Further, we can easily have memory holes,
for which there (currently) are no storage attributes.
In particular, with memory holes, KVM_S390_SET_CMMA_BITS will fail to set
some storage attributes.
So let's do it like we handle storage keys migration, relying on
guest_phys_blocks_append(). However, in contrast to storage key
migration, we will handle it on the migration destination.
This is a preparation for virtio-mem support. Note that ever since the
"early migration" feature was added (x-early-migration), the state
of device blocks (plugged/unplugged) is migrated early such that
guest_phys_blocks_append() will properly consider all currently plugged
memory blocks and skip any unplugged ones.
In the future, we should try getting rid of the large temporary buffer
and also not send any attributes for any memory holes, just so they
get ignored on the destination.
Message-ID: <20241219144115.2820241-9-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
A guest OS that supports memory hotplug / memory devices must during
boot be aware of the maximum possible physical memory address that it might
have to handle at a later stage during its runtime.
For example, the maximum possible memory address might be required to
prepare the kernel virtual address space accordingly (e.g., select page
table hierarchy depth).
On s390x there is currently no such mechanism that is compatible with
paravirtualized memory devices, because the whole SCLP interface was
designed around the idea of "storage increments" and "standby memory".
Paravirtualized memory devices we want to support, such as virtio-mem, have
no intersection with any of that, but could co-exist with them in the
future if ever needed.
In particular, a guest OS must never detect and use device memory
without the help of a proper device driver. Device memory must not be
exposed in any firmware-provided memory map (SCLP or diag260 on s390x).
For this reason, these memory devices will be places in memory *above*
the "maximum storage increment" exposed via SCLP.
Let's provide a new diag500 subcode to query the memory limit determined in
s390_memory_init().
Message-ID: <20241219144115.2820241-8-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
Let's add s390_get_memory_limit(), to query what has been successfully
set via s390_set_memory_limit(). Allow setting the limit only once.
We'll remember the limit in the machine state. Move
s390_set_memory_limit() to machine code, merging it into
set_memory_limit(), because this really is a machine property.
Message-ID: <20241219144115.2820241-7-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
machine code
Nowadays, it feels more natural to have that code located in
s390_memory_init(), where we also have direct access to the machine
object.
While at it, use the actual RAM size, not the maximum RAM size which
cannot currently be reached without support for any memory devices.
Consequently update s390_pv_vm_try_disable_async() to rely on the RAM size
as well, to avoid temporary issues while we further rework that
handling.
set_memory_limit() is temporary, we'll merge it with
s390_set_memory_limit() next.
Message-ID: <20241219144115.2820241-6-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
Let's make it clearer that we are talking about general
QEMU/KVM-specific hypercalls.
Message-ID: <20241219144115.2820241-5-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
Let's generalize, abstracting the virtio bits. diag500 is now a generic
hypercall to handle QEMU/KVM specific things. Explicitly specify all
already defined subcodes, including legacy ones (so we know what we can
use for new hypercalls).
Move the PGM_SPECIFICATION injection into the renamed function
handle_diag_500(), so we can turn it into a void function.
We'll rename the files separately, so git properly detects the rename.
Message-ID: <20241219144115.2820241-4-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
Nowadays, we only have a single machine type in QEMU, everything is based
on virtio-ccw and the traditional virtio machine does no longer exist. No
need to dynamically register diag500 handlers. Move the two existing
handlers into s390-virtio-hcall.c.
Message-ID: <20241219144115.2820241-3-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
KVM is not happy when starting a VM with weird RAM sizes:
# qemu-system-s390x --enable-kvm --nographic -m 1234K
qemu-system-s390x: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION
failed, slot=0, start=0x0, size=0x244000: Invalid argument
kvm_set_phys_mem: error registering slot: Invalid argument
Aborted (core dumped)
Let's handle that in a better way by rejecting such weird RAM sizes
right from the start:
# qemu-system-s390x --enable-kvm --nographic -m 1234K
qemu-system-s390x: ram size must be multiples of 1 MiB
Message-ID: <20241219144115.2820241-2-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
We recently converted from the LegacyReset to the new reset framework
in commit c009a311e939 ("virtio-mem: Use new Resettable framework instead
of LegacyReset") to be able to use the ResetType to filter out wakeup
resets.
However, this change had an undesired implications: as we override the
Resettable interface methods in VirtIOMEMClass, the reset handler will
not only get called during system resets (i.e., qemu_devices_reset())
but also during any direct or indirect device rests (e.g.,
device_cold_reset()).
Further, we might now receive two reset callbacks during
qemu_devices_reset(), first when reset by a parent and later when reset
directly.
The memory state of virtio-mem devices is rather special: it's supposed to
be persistent/unchanged during most resets (similar to resetting a hard
disk will not destroy the data), unless actually cold-resetting the whole
system (different to a hard disk where a reboot will not destroy the data):
ripping out system RAM is something guest OSes don't particularly enjoy,
but we want to detect when rebooting to an OS that does not support
virtio-mem and wouldn't be able to detect+use the memory -- and we want
to force-defragment hotplugged memory to also shrink the usable device
memory region. So we rally want to catch system resets to do that.
On supported targets (e.g., x86), getting a cold reset on the
device/parent triggers is not that easy (but looks like PCI code
might trigger it), so this implication went unnoticed.
However, with upcoming s390x support it is problematic: during
kdump, s390x triggers a subsystem reset, ending up in
s390_machine_reset() and calling only subsystem_reset() instead of
qemu_devices_reset() -- because it's not a full system reset.
In subsystem_reset(), s390x performs a device_cold_reset() of any
TYPE_VIRTUAL_CSS_BRIDGE device, which ends up resetting all children,
including the virtio-mem device. Consequently, we wrongly detect a system
reset and unplug all device memory, resulting in hotplugged memory not
getting included in the crash dump -- undesired.
We really must not mess with hotplugged memory state during simple
device resets. To fix, create+register a new reset object that will only
get triggered during qemu_devices_reset() calls, but not during any other
resets as it is logically not the child of any other object.
Message-ID: <20241025104103.342188-1-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Juraj Marcin <jmarcin@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
https://github.com/alistair23/qemu into staging
RISC-V PR for 10.0
* Correct the validness check of iova
* Fix APLIC in_clrip and clripnum write emulation
* Support riscv-iommu-sys device
* Add Tenstorrent Ascalon CPU
* Add AIA userspace irqchip_split support
* Add Microblaze V generic board
* Upgrade ACPI SPCR table to support SPCR table revision 4 format
* Remove tswap64() calls from HTIF
* Support 64-bit address of initrd
* Introduce svukte ISA extension
* Support ssstateen extension
* Support for RV64 Xiangshan Nanhu CPU
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmdkzjgACgkQr3yVEwxT
# gBOcyA//e0XhAQciQglCZZCfINdOyI8qSh+P2K0qtrXZ4VERHEMp7UoD5CQr2cZv
# h8ij1EkatXCwukVELx0rNckxG33bEFgG1oESnQSrwGE0Iu4csNW24nK5WlUS0/r+
# A5oD2wtzEF+cbhTKrVSDBN/PvlnWTKGEoJRkuXWfz5d4uR9eyQhfED0S2j36lNEC
# X1x/OZoKM89XuXtOFe9g55Z5UNzAatcdTISozL0FydiPh7QeVjTLHh28/tt559MX
# 7v5aJFlQuZ78z1mIHkZmPSorSrJ0zqhkP6NWe1ae06oMgzwRQQhYLppDILV4ZgUF
# 3mSDRoXmBycQXiYNPcHep3LdXfvxr+PpWHSevx8gH1jwm93On7Y/H7Uol6TDXzfC
# mrFjalfV5tzrD90ZvB+s5bCMF1q5Z8Dlj0pYF9aN9P1ILoWy3dndFAPJB6uKKDP7
# Qd4qOQ3dVyHAX9jLmVkB6QvAV/vTDrYTsAxaF/EaoLOy0IoKhjTvgda3XzE1MFKA
# gVafLluADIfSEdqa2QR2ExL8d1SZVoiObCp5TMLRer0HIpg/vQZwjfdbo4BgQKL3
# 7Q6wBxcZUNqrFgspXjm5WFIrdk2rfS/79OmvpNM6SZaK6BnklntdJHJHtAWujGsm
# EM310AUFpHMp2h6Nqnemb3qr5l4d20KSt8DhoPAUq1IE59Kb8XY=
# =0iQW
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 19 Dec 2024 20:54:00 EST
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20241220' of https://github.com/alistair23/qemu: (39 commits)
target/riscv: add support for RV64 Xiangshan Nanhu CPU
target/riscv: add ssstateen
target/riscv/tcg: hide warn for named feats when disabling via priv_ver
target/riscv: Include missing headers in 'internals.h'
target/riscv: Include missing headers in 'vector_internals.h'
target/riscv: Check svukte is not enabled in RV32
target/riscv: Expose svukte ISA extension
target/riscv: Check memory access to meet svukte rule
target/riscv: Support hstatus[HUKTE] bit when svukte extension is enabled
target/riscv: Support senvcfg[UKTE] bit when svukte extension is enabled
target/riscv: Add svukte extension capability variable
hw/riscv: Add the checking if DTB overlaps to kernel or initrd
hw/riscv: Add a new struct RISCVBootInfo
hw/riscv: Support to load DTB after 3GB memory on 64-bit system.
hw/char/riscv_htif: Clarify MemoryRegionOps expect 32-bit accesses
hw/char/riscv_htif: Explicit little-endian implementation
MAINTAINERS: Cover RISC-V HTIF interface
tests/qtest/bios-tables-test: Update virt SPCR golden reference for RISC-V
hw/acpi: Upgrade ACPI SPCR table to support SPCR table revision 4 format
qtest: allow SPCR acpi table changes
...
Conflicts:
target/riscv/cpu.c
Merge conflict with DEFINE_PROP_END_OF_LIST() removal. No Property
array terminator is needed anymore.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
* qdev: second part of Property cleanups
* rust: second part of QOM rework
* rust: callbacks wrapper
* rust: pl011 bugfixes
* kvm: cleanup errors in kvm_convert_memory()
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmdkaEkUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroN0/wgAgIJg8BrlRKfmiz14NZfph8/jarSj
# TOWYVxL2v4q98KBuL5pta2ucObgzwqyqSyc02S2DGSOIMQCIiBB5MaCk1iMjx+BO
# pmVU8gNlD8faO8SSmnnr+jDQt+G+bQ/nRgQJOAReF8oVw3O2aC/FaVKpitMzWtvv
# PLnJWdrqqpGq14OzX8iNCzSujxppAuyjrhT4lNlekzDoDfdTez72r+rXkvg4GzZL
# QC3xLYg/LrT8Rs+zgOhm/AaIyS4bOyMlkU9Du1rQ6Tyne45ey2FCwKVzBKrJdGcw
# sVbzEclxseLenoTbZqYK6JTzLdDoThVUbY2JwoCGUaIm+74P4NjEsUsTVg==
# =TuQM
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 19 Dec 2024 13:39:05 EST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (42 commits)
rust: pl011: simplify handling of the FIFO enabled bit in LCR
rust: pl011: fix migration stream
rust: pl011: extend registers to 32 bits
rust: pl011: fix break errors and definition of Data struct
rust: pl011: always use reset() method on registers
rust: pl011: match break logic of C version
rust: pl011: fix declaration of LineControl bits
target/i386: Reset TSCs of parked vCPUs too on VM reset
kvm: consistently return 0/-errno from kvm_convert_memory
rust: qemu-api: add a module to wrap functions and zero-sized closures
rust: qom: add initial subset of methods on Object
rust: qom: add casting functionality
rust: tests: allow writing more than one test
bql: add a "mock" BQL for Rust unit tests
rust: re-export C types from qemu-api submodules
rust: rename qemu-api modules to follow C code a bit more
rust: qom: add possibility of overriding unparent
rust: qom: put class_init together from multiple ClassInitImpl<>
Constify all opaque Property pointers
hw/core/qdev-properties: Constify Property argument to PropertyInfo.print
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
"exec/confidential-guest-support.h" is specific to system
emulation, so move it under the system/ namespace.
Mechanical change doing:
$ sed -i \
-e 's,exec/confidential-guest-support.h,sysemu/confidential-guest-support.h,' \
$(git grep -l exec/confidential-guest-support.h)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20241218155913.72288-2-philmd@linaro.org>
|
|
Headers in include/sysemu/ are not only related to system
*emulation*, they are also used by virtualization. Rename
as system/ which is clearer.
Files renamed manually then mechanical change using sed tool.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Message-Id: <20241203172445.28576-1-philmd@linaro.org>
|
|
"system/confidential-guest-support.h" is not needed,
remove it. Reorder #ifdef'ry to reduce declarations
exposed on user emulation.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20241218155913.72288-3-philmd@linaro.org>
|
|
Always explicitly create QEMU system containers upfront.
Root containers will be created when trying to fetch the root object the
1st time. They are:
/objects
/chardevs
/backend
Machine sub-containers will be created only until machine is being
initialized. They are:
/machine/unattached
/machine/peripheral
/machine/peripheral-anon
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241121192202.4155849-8-peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
QEMU will start to not rely on implicit creations of containers soon. Make
PPC drc devices follow by explicitly create the container whenever a drc
device is realized, dropping container_get() calls.
No functional change intended.
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Daniel Henrique Barboza <danielhb413@gmail.com>
Cc: Harsh Prateek Bora <harshpb@linux.ibm.com>
Cc: qemu-ppc@nongnu.org
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241121192202.4155849-7-peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
container_get() is going to become strict on not allowing to return a
non-container.
Switch the e500 user to use object_resolve_path_component() explicitly.
Cc: Bharat Bhushan <r65777@freescale.com>
Cc: qemu-ppc@nongnu.org
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-ID: <20241121192202.4155849-6-peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
Provide a macro for the container type across QEMU source tree, rather than
hard code it every time.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-ID: <20241121192202.4155849-2-peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
DTB is placed to the end of memory, so we will check if the start
address of DTB overlaps to the address of kernel/initrd.
Signed-off-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20241120153935.24706-4-jim.shu@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
|
|
Add a new struct RISCVBootInfo to sync boot information between multiple
boot functions.
Signed-off-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20241120153935.24706-3-jim.shu@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
|
|
Larger initrd image will overlap the DTB at 3GB address. Since 64-bit
system doesn't have 32-bit addressable issue, we just load DTB to the end
of dram in 64-bit system.
Signed-off-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20241120153935.24706-2-jim.shu@sifive.com>
[ Changes by AF
- Store fdt_load_addr_hi32 in the reset vector
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
|
|
Looking at htif_mm_ops[] read/write handlers, we notice they
expect 32-bit values to accumulate into to the 'fromhost' and
'tohost' 64-bit variables. Explicit by setting the .impl
min/max fields.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20241129154304.34946-4-philmd@linaro.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
|
|
Since our RISC-V system emulation is only built for little
endian, the HTIF device aims to interface with little endian
memory accesses, thus we can explicit htif_mm_ops:endianness
being DEVICE_LITTLE_ENDIAN.
In that case tswap64() is equivalent to le64_to_cpu(), as in
"convert this 64-bit little-endian value into host cpu order".
Replace to simplify.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20241129154304.34946-3-philmd@linaro.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
|
|
Update the SPCR table to accommodate the SPCR Table revision 4 [1].
The SPCR table has been modified to adhere to the revision 4 format [2].
[1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/serports/serial-port-console-redirection-table
[2]: https://github.com/acpica/acpica/pull/931
Signed-off-by: Sia Jee Heng <jeeheng.sia@starfivetech.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Message-ID: <20241028015744.624943-3-jeeheng.sia@starfivetech.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
|
|
Add a basic board with interrupt controller (intc), timer, serial
(uartlite), small memory called LMB@0 (128kB) and DDR@0x80000000
(configured via command line eg. -m 2g).
This is basic configuration which matches HW generated out of AMD Vivado
(design tools). But initial configuration is going beyond what it is
configured by default because validation should be done on other
configurations too. That's why wire also additional uart16500, axi
ethernet(with axi dma).
GPIOs, i2c and qspi is also listed for completeness.
IRQ map is: (addr)
0 - timer (0x41c00000)
1 - uartlite (0x40600000)
2 - i2c (0x40800000)
3 - qspi (0x44a00000)
4 - uart16550 (0x44a10000)
5 - emaclite (0x40e00000)
6 - timer2 (0x41c10000)
7 - axi emac (0x40c00000)
8 - axi dma (0x41e00000)
9 - axi dma
10 - gpio (0x40000000)
11 - gpio2 (0x40010000)
12 - gpio3 (0x40020000)
Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@amd.com>
Signed-off-by: Michal Simek <michal.simek@amd.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241125134739.18189-1-sai.pavan.boddu@amd.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
|
|
The last step to enable KVM AIA aplic-imsic with irqchip in split mode
is to deal with how MSIs are going to be sent. In our current design we
don't allow an APLIC controller to send MSIs unless it's on m-mode. And
we also do not allow Supervisor MSI address configuration via the
'smsiaddrcfg' and 'smsiaddrcfgh' registers unless it's also a m-mode
APLIC controller.
Add a new RISCVACPLICState attribute called 'kvm_msicfgaddr'. This
attribute represents the base configuration address for MSIs, in our
case the base addr of the IMSIC controller. This attribute is being set
only when running irqchip_split() mode with aia=aplic-imsic.
During riscv_aplic_msi_send() we'll check if the attribute was set to
skip the check for a m-mode APLIC controller and to change the resulting
MSI addr by adding kvm_msicfgaddr right before address_space_stl_le().
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241119191706.718860-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
|
|
The current logic to determine if we don't need an emulated APLIC
controller, i.e. KVM will provide for us, is to determine if we're
running KVM, with in-kernel irqchip support, and running
aia=aplic-imsic. This is modelled by riscv_is_kvm_aia_aplic_imsic() and
virt_use_kvm_aia_aplic_imsic().
This won't suffice to support irqchip_split() mode: it will match
exactly the same conditions as the one above, but setting the irqchip to
'split' mode will now require us to emulate an APLIC s-mode controller,
like we're doing with 'aia=aplic'.
Create a new riscv_use_emulated_aplic() helper that will encapsulate
this logic. Replace the uses of "riscv_is_kvm_aia_aplic_imsic()" with
this helper every time we're taking a decision on emulate an APLIC
controller or not. Do the same in virt.c with virt_use_emulated_aplic().
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241119191706.718860-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
|
|
Similar to the riscv_is_kvm_aia_aplic_imsic() helper from riscv_aplic.c,
the existing virt_use_kvm_aia() is testing for KVM aia=aplic-imsic with
in-kernel irqchip enabled. It is not checking for a generic AIA support.
Rename the helper to virt_use_kvm_aia_aplic_imsic() to reflect what the
helper is doing, and use the existing riscv_is_kvm_aia_aplic_imsic() to
obscure details such as the presence of the in-kernel irqchip.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241119191706.718860-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
|
|
In create_fdt_sockets() we have the following pattern:
if (kvm_enabled() && virt_use_kvm_aia(s)) {
(... do stuff ...)
} else {
(... do other stuff ...)
}
if (kvm_enabled() && virt_use_kvm_aia(s)) {
(... do more stuff ...)
} else {
(... do more other stuff)
}
Do everything in a single if/else clause to reduce the usage of
virt_use_kvm_aia() helper and to make the code a bit less repetitive.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241119191706.718860-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
|
|
The helper is_kvm_aia() is checking not only for AIA, but for
aplic-imsic (i.e. "aia=aplic-imsic" in 'virt' RISC-V machine) with an
in-kernel chip present.
Rename it to be a bit clear what the helper is doing since we'll add
more AIA helpers in the next patches.
Make the helper public because the 'virt' machine will use it as well.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241119191706.718860-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
|
|
Add a riscv_iommu_reset() helper in the base emulation code that
implements the expected reset behavior as defined by the riscv-iommu
spec.
Devices can then use this helper in their own reset callbacks.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241106133407.604587-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
|