Age | Commit message | Author | Files | Lines
2025-01-29 | migration: Merge precopy/postcopy on switchover start | Peter Xu | 1 | -30/+32
Now after all the cleanups, finally we can merge the switchover startup phase into one single function for precopy/postcopy. Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250114230746.3268797-16-peterx@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: Always set DEVICE state | Peter Xu | 8 | -34/+64
DEVICE state was introduced back in 2017: https://lore.kernel.org/qemu-devel/20171020090556.18631-1-dgilbert@redhat.com/

Quoting Dave's cover letter, when the pre-switchover phase is enabled, the state transitions look like this:

  The precopy flow is: active->pre-switchover->device->completed
  The postcopy flow is: active->pre-switchover->postcopy-active->completed

To supplement the above, when the cap is not enabled:

  The precopy flow is: active->completed
  The postcopy flow is: active->postcopy-active->completed

It works for us, though we have some code just to special-case these state transitions, so the DEVICE state is currently special to precopy only, and only conditionally. I had a quick discussion with Libvirt developers, and it turns out that this may not be necessary. IOW, it seems okay to make the DEVICE state generic, so that we don't have over-complicated state machines. This not only helps align all the migration state machines and clean up the code paths, especially the pre-switchover handling (see the patch itself); a side benefit is that we unconditionally have a specific state to mark the switchover phase, which might be helpful for debugging too.

This patch makes the DEVICE state always present, marking that source QEMU is switching over. Then the state machine will always be as simple as:

  active -> [pre-switchover ->] device -> [postcopy-active ->] completed

After the change, no matter whether pre-switchover or postcopy is enabled, we always have the DEVICE state showing the switchover phase. When pre-switchover is enabled, we'll have an extra stage before that. When postcopy is enabled, we'll have an extra stage after that.

A few qtests need a touch-up in the QEMU tree for this change:
- A few iotest outputs (194, 203, 234, 262, 280)
- Teach libqos's migrate() about the "device" state

Cc: Jiri Denemark <jdenemar@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Dr. David Alan Gilbert <dave@treblig.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-15-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
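The unified state sequence described above can be sketched as a small function. This is only an illustration of the documented flow, not QEMU code; the function name `migration_states` and its two boolean parameters are made up for the example.

```python
def migration_states(pre_switchover: bool, postcopy: bool) -> list:
    """Return the source-side state sequence after this change (illustrative only)."""
    states = ["active"]
    if pre_switchover:
        # Optional stage before the switchover, present only with the capability.
        states.append("pre-switchover")
    # DEVICE is now unconditional and marks the switchover phase.
    states.append("device")
    if postcopy:
        # Optional stage after the switchover, present only for postcopy.
        states.append("postcopy-active")
    states.append("completed")
    return states
```

With both options off the sequence is active, device, completed; each option adds exactly one stage on its side of "device".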
2025-01-29 | migration: Cleanup qemu_savevm_state_complete_precopy() | Peter Xu | 1 | -13/+7
Now qemu_savevm_state_complete_precopy() is never used in postcopy, clean it up as in_postcopy==false now unconditionally. Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250114230746.3268797-14-peterx@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: Unwrap qemu_savevm_state_complete_precopy() in postcopy | Peter Xu | 3 | -3/+12
Postcopy has invoked qemu_savevm_state_complete_precopy() twice for a long time, and that has caused way too much confusion. Let's clean this up and make postcopy easier to read. It's actually fairly straightforward: postcopy starts with saving non-postcopiable iterables, then later it saves again with non-iterables only. Moving these two calls out makes everything much easier to follow. Otherwise it's very unclear what qemu_savevm_state_complete_precopy() did in either of the calls. No functional change intended. Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250114230746.3268797-13-peterx@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: Notify COMPLETE once for postcopy | Peter Xu | 3 | -8/+16
Postcopy invokes qemu_savevm_state_complete_precopy() twice, which means it invokes the COMPLETE notifier twice, and likewise fires the tracepoint marking precopy complete twice. Move that notification (along with the tracepoint) out to the caller, so that postcopy will only notify once, right at the start of the switchover phase from precopy. While at it, rename it to suit the file it now lives in. For precopy, there should be no functional change except that the tracepoint has a name change. For the other two users of qemu_savevm_state_complete_precopy(), namely qemu_savevm_state() and qemu_savevm_live_state(), the notifier shouldn't matter because they're not precopy at all. In these two contexts (aka "savevm" and "colo") the precopy notifiers will sometimes still be invoked, but that's outside the scope of this patch. Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250114230746.3268797-12-peterx@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: Take BQL slightly longer in postcopy_start() | Peter Xu | 1 | -3/+2
This paves the way for a follow-up patch to modify migration states at the end of postcopy_start(), which had better be done with the BQL held so that there's no way of concurrent cancellation. So we'll do slightly more work with the BQL held, but it's really trivial; hopefully nothing will really change with this. A side benefit is that we can drop another explicit lock() in the failure path. Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250114230746.3268797-11-peterx@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: Drop cached migration state in migration_maybe_pause() | Peter Xu | 1 | -19/+8
I can't see why we must cache the state now that we have avoided the possible CANCEL race: that's the only thing I can think of that can modify the migration state concurrently with the migration thread itself. Make all the state updates happen unconditionally, then we don't need to cache the state anymore. Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250114230746.3268797-10-peterx@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: Adjust locking in migration_maybe_pause() | Peter Xu | 1 | -2/+2
In migration_maybe_pause() QEMU may yield the BQL before waiting for a semaphore. However, it yields the BQL too early, which logically gives the main thread a chance to quickly take the BQL and modify the state to CANCELLING. To prevent such a race condition from happening at all, always update the migration states within the BQL. That makes sure no concurrent cancellation can ever happen. With that, IIUC there's a chance we can remove the extra parameter in migration_maybe_pause() to update the active state, but that'll be done separately later. Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250114230746.3268797-9-peterx@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
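The locking rule above (every state transition happens while the lock is held, and only the wait itself happens without it) can be sketched with a toy model. This is not QEMU code: `threading.Lock` stands in for the BQL, an `Event` stands in for the semaphore, and the class and method names are hypothetical.

```python
import threading


class ToyMigration:
    """Toy model: a plain lock stands in for the BQL (assumption for illustration)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.state = "active"

    def _set_state(self, old, new):
        # Compare-and-set; callers must hold self.lock.
        if self.state == old:
            self.state = new
            return True
        return False

    def cancel(self):
        # Main-thread side: cancellation also happens under the lock.
        with self.lock:
            if self.state in ("active", "pre-switchover"):
                self.state = "cancelling"
                return True
            return False

    def maybe_pause(self, resume):
        # Update the state *before* dropping the lock, so a concurrent
        # cancel() either ran earlier (and the transition fails) or sees
        # the new state.
        with self.lock:
            if not self._set_state("active", "pre-switchover"):
                return False
        resume.wait()  # wait for the "continue" signal without the lock
        with self.lock:
            return self._set_state("pre-switchover", "device")
```

If cancel() wins the lock first, maybe_pause() observes "cancelling" and backs out instead of overwriting it.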
2025-01-29 | migration: Adjust postcopy bandwidth during switchover | Peter Xu | 1 | -7/+9
Precopy always uses unlimited bandwidth during switchover, which makes sense because this phase is so critical and no one would like to throttle bandwidth during the VM blackout. OTOH, postcopy surprisingly didn't do that. There's one line in the middle of the postcopy switchover that switches to the specified max-postcopy-bandwidth, but its placement partway through is strange. This patch makes the two modes always use unlimited bandwidth for switchover, and only applies the postcopy max bandwidth after the switchover is completed. Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250114230746.3268797-8-peterx@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: Synchronize all CPU states only for non-iterable dump | Peter Xu | 2 | -7/+4
Do a one-shot CPU sync in qemu_savevm_state_complete_precopy_non_iterable(), instead of coding it separately in two places. Note that in the context of qemu_savevm_state_complete_precopy(), this patch is also an optimization for the postcopy path, in that we avoid syncing CPUs twice during switchover: before this patch, postcopy_start() invoked qemu_savevm_state_complete_precopy() twice, and each call would try to sync CPU info. In reality, one of them is enough. For background snapshot, there's no intended functional change. Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250114230746.3268797-7-peterx@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: Drop inactivate_disk param in qemu_savevm_state_complete* | Peter Xu | 3 | -31/+23
This parameter is only used by one caller, which is the genuine precopy complete path (migration_completion_precopy). The parameter was introduced in a1fbe750fd ("migration: Fix race of image locking between src and dst") to make sure the inactivation happens before EOF, so that dest will always be able to activate the disk properly. However, there's no limitation on how early we inactivate the disk. For the precopy completion path, we can always do that as long as the VM is stopped. Move the disk inactivation there; then we can remove the inactivate_disk parameter from the whole call stack, because all the remaining users always pass in false. Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250114230746.3268797-6-peterx@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: Avoid two src-downtime-end tracepoints for postcopy | Peter Xu | 1 | -2/+1
Postcopy can trigger this tracepoint twice, while only the 1st one is valid. Avoid triggering the 2nd tracepoint just like what we do with recording the total downtime. Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250114230746.3268797-5-peterx@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: Optimize postcopy on downtime by avoiding JSON writer | Peter Xu | 1 | -2/+15
postcopy_start() is the entry function after which postcopy is destined to start. It also means QEMU source will not dump the VM description, aka, the JSON writer is garbage now. We could leave that to be cleaned up when migration completes; however, with the JSON writer object still present, vmstate_save() will still try to construct the JSON objects for the VM descriptions, even though they'll never be used later in postcopy. To save those cycles, release the JSON writer earlier for postcopy. Then vmstate_save() will be smart enough to skip the JSON object construction completely. It can logically reduce downtime because all such JSON construction happens during the postcopy blackout. Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250114230746.3268797-4-peterx@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: Do not construct JSON description if suppressed | Peter Xu | 3 | -26/+33
The QEMU machine has a property "suppress-vmdesc". When it is enabled, QEMU will stop attaching the JSON VM description at the end of the precopy migration stream (postcopy is never affected because postcopy never attaches it). However, even if it's suppressed by the user, the source QEMU will still construct the JSON descriptions, which is a complete waste of CPU and memory resources. To avoid that, only create the JSON writer object if suppress-vmdesc is not specified. Luckily, vmstate_save() already supports vmdesc==NULL, so only a few spots are left to be prepared for vmdesc being NULL. While at it, move the init / destroy of the JSON writer object to the start / end of the migration: the JSON writer object is a sub-struct of the migration state, and it looks like the only object that was dynamically allocated / destroyed within the migration process. Make it the same as the rest of the objects that migration uses. Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250114230746.3268797-3-peterx@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
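The idea of skipping description work when no writer object exists can be sketched as follows. This is a simplified analogy, not the vmstate code: the helper names `save_fields` and `migrate`, the 4-byte field encoding, and the JSON layout are all assumptions made for the example.

```python
import json
from typing import Optional


def save_fields(fields: dict, vmdesc: Optional[list]) -> bytes:
    """Serialize field values; build description entries only if vmdesc exists."""
    payload = b"".join(int(v).to_bytes(4, "big") for v in fields.values())
    if vmdesc is not None:
        # The description is constructed only when someone will send it.
        vmdesc.extend({"name": name, "size": 4} for name in fields)
    return payload


def migrate(fields: dict, suppress_vmdesc: bool) -> bytes:
    # Create the description container only when it is not suppressed,
    # mirroring "only create the JSON writer if suppress-vmdesc is not set".
    vmdesc = None if suppress_vmdesc else []
    stream = save_fields(fields, vmdesc)
    if vmdesc is not None:
        stream += json.dumps(vmdesc).encode()
    return stream
```

The saving helper takes the None case as "do nothing extra", so callers decide once, up front, whether any description work happens at all.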
2025-01-29 | migration: Remove postcopy implications in should_send_vmdesc() | Peter Xu | 1 | -10/+11
should_send_vmdesc() has a hack inside (not reflected in the function name): it tries to detect the global postcopy state, and that affects the value returned. It's easier to keep the helper simple by only checking the suppress-vmdesc property. Then:
- On the sender side, there's already an in_postcopy variable that we can use: postcopy doesn't send vmdesc at all, so directly skip everything for postcopy.
- On the recv side, when reaching vmdesc processing it must already be precopy code, hence that hack check never used to work anyway.
No functional change intended, except a trivial side effect that QEMU source will start to avoid running some JSON helpers in the postcopy path, which only reduces the postcopy blackout window a bit and has no bad side effects.
Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250114230746.3268797-2-peterx@redhat.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: cpr-transfer documentation | Steve Sistare | 1 | -2/+182
Add documentation for the cpr-transfer migration mode. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-25-git-send-email-steven.sistare@oracle.com [add -machine memory-backend=ram0] Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration-test: cpr-transfer | Steve Sistare | 3 | -0/+88
Add a migration test for cpr-transfer mode. Defer the connection to the target monitor, else the test hangs because in cpr-transfer mode QEMU does not listen for monitor connections until we send the migrate command to source QEMU. To test -incoming defer, send a migrate incoming command to the target, after sending the migrate command to the source, as required by cpr-transfer mode. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-24-git-send-email-steven.sistare@oracle.com [only allocate in_channels when needed] Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | tests/qtest: assert qmp connected | Steve Sistare | 1 | -0/+4
Assert that qmp_fd is valid when we communicate with the monitor. Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1736967650-129648-23-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | tests/qtest: enhance migration channels | Steve Sistare | 6 | -25/+76
Change the migrate_qmp and migrate_qmp_fail channels argument to a QObject type so the caller can manipulate the object before passing it to the helper. Define migrate_str_to_channel to aid such manipulation. Add a channels argument to migrate_incoming_qmp. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-22-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration-test: defer connection | Steve Sistare | 2 | -3/+22
Add an option to defer connection to the target monitor, needed by the cpr-transfer test. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/1736967650-129648-21-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | tests/qtest: defer connection | Steve Sistare | 3 | -40/+90
Add an option to defer connecting to the monitor and qtest sockets when calling qtest_init_with_env. The client makes the connection later by calling qtest_connect and qtest_qmp_handshake. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-20-git-send-email-steven.sistare@oracle.com [plumb capabilities list into qtest_qmp_handshake] Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | tests/qtest: optimize migrate_set_ports | Steve Sistare | 1 | -8/+15
Do not query connection parameters if all port numbers are known. This is more efficient, and also solves a problem for the cpr-transfer test. At the point where cpr-transfer calls migrate_qmp and migrate_set_ports, the monitor is not connected and queries are not allowed. Port=0 is never used for cpr-transfer. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-19-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
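The short-circuit described here (query only when some port is still unknown) can be sketched as below. This is an illustration rather than the qtest helper itself: `fill_ports`, the channel dictionaries, and the `query` callback are hypothetical names, with port 0 used as the "unknown" placeholder.

```python
def fill_ports(channels, query):
    """Resolve port-0 placeholders, calling query() only when one exists."""
    if all(ch["port"] != 0 for ch in channels):
        # Every port is already known: skip the query, which may not even
        # be possible if the monitor is not connected yet.
        return channels
    actual = query()  # one port number per channel, in the same order
    return [
        dict(ch, port=port) if ch["port"] == 0 else ch
        for ch, port in zip(channels, actual)
    ]
```

A caller whose ports are all fixed can therefore pass a query callback that would fail, and it is never invoked.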
2025-01-29 | migration-test: memory_backend | Steve Sistare | 2 | -4/+16
Allow each migration test to define its own memory backend, replacing the standard "-m <size>" specification. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/1736967650-129648-18-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: cpr-transfer mode | Steve Sistare | 11 | -10/+210
Add the cpr-transfer migration mode, which allows the user to transfer a guest to a new QEMU instance on the same host with minimal guest pause time, by preserving guest RAM in place, albeit with new virtual addresses in new QEMU, and by preserving device file descriptors. Pages that were locked in memory for DMA in old QEMU remain locked in new QEMU, because the descriptor of the device that locked them remains open.

cpr-transfer preserves memory and device descriptors by sending them to new QEMU over a unix domain socket using SCM_RIGHTS. Such CPR state cannot be sent over the normal migration channel, because devices and backends are created prior to reading the channel, so this mode sends CPR state over a second "cpr" migration channel. New QEMU reads the cpr channel prior to creating devices or backends. The user specifies the cpr channel in the channel arguments on the outgoing side, and in a second -incoming command-line parameter on the incoming side.

The user must start old QEMU with the '-machine aux-ram-share=on' option, which allows anonymous memory to be transferred in place to the new process by transferring a memory descriptor for each ram block. Memory-backend objects must have the share=on attribute, but memory-backend-epc is not supported.

The user starts new QEMU on the same host as old QEMU, with command-line arguments to create the same machine, plus the -incoming option for the main migration channel, like normal live migration. In addition, the user adds a second -incoming option with channel type "cpr". This CPR channel must support file descriptor transfer with SCM_RIGHTS, i.e. it must be a UNIX domain socket.

To initiate CPR, the user issues a migrate command to old QEMU, adding a second migration channel of type "cpr" in the channels argument. Old QEMU stops the VM, saves state to the migration channels, and enters the postmigrate state. New QEMU mmap's memory descriptors, and execution resumes.

The implementation splits qmp_migrate into start and finish functions. Start sends CPR state to new QEMU, which responds by closing the CPR channel. Old QEMU detects the HUP then calls finish, which connects the main migration channel.

In summary, the usage is:

  qemu-system-$arch -machine aux-ram-share=on ...

  start new QEMU with "-incoming <main-uri> -incoming <cpr-channel>"

  Issue commands to old QEMU:
    migrate_set_parameter mode cpr-transfer

    {"execute": "migrate", ...
        {"channel-type": "main"...}, {"channel-type": "cpr"...} ... }

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-17-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
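The underlying mechanism (pass a memory file descriptor over a UNIX domain socket with SCM_RIGHTS, then mmap it on the other side so the contents stay in place) can be demonstrated in a few lines. This is a standalone sketch, not how QEMU implements it: both "sides" live in one process, a temporary file stands in for the memfd, the name "ram0" is just a label, and `socket.send_fds`/`socket.recv_fds` assume Python 3.9 or later on a Unix host.

```python
import mmap
import os
import socket
import tempfile


def share_memory_over_socket(payload: bytes) -> bytes:
    """Write payload via one mapping, pass the fd with SCM_RIGHTS, read it back."""
    size = mmap.PAGESIZE
    with tempfile.TemporaryFile() as backing:
        backing.truncate(size)
        # "Old" side: shared mapping of the fd-backed memory
        # (mmap defaults to MAP_SHARED on Unix).
        old_map = mmap.mmap(backing.fileno(), size)
        old_map[: len(payload)] = payload

        old_sock, new_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
        with old_sock, new_sock:
            # The descriptor travels as ancillary data next to a small message.
            socket.send_fds(old_sock, [b"ram0"], [backing.fileno()])
            _name, fds, _flags, _addr = socket.recv_fds(new_sock, 16, 1)

        # "New" side: map the received descriptor; no data was copied.
        new_map = mmap.mmap(fds[0], size)
        try:
            return bytes(new_map[: len(payload)])
        finally:
            new_map.close()
            old_map.close()
            os.close(fds[0])
```

The received descriptor has a different number than the one sent, but it refers to the same memory, which is why a regular (non-socket) migration channel cannot carry this state.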
2025-01-29 | Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging | Stefan Hajnoczi | 18 | -829/+1397
* target/i386: optimize string instructions * target/i386: new Sierra Forest and Clearwater Forest models * rust: type-safe vmstate implementation * rust: use interior mutability for PL011 * rust: clean ups * memtxattrs: remove usage of bitfields from MEMTXATTRS_UNSPECIFIED * gitlab-ci: enable Rust backtraces # -----BEGIN PGP SIGNATURE----- # # iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmeZ6VYUHHBib256aW5p # QHJlZGhhdC5jb20ACgkQv/vSX3jHroMjbQgApuooMOp0z/8Ky4/ux8M8/vrlcNCH # V1Pm6WzrjEzd9TIMLGr6npOyLOkWI31Aa4o/TuW09SeKE3dpCf/7LYA5VDEtkH79 # F57MgnSj56sMNgu+QZ/SiGvkKJXl+3091jIianrrI0dtX8hPonm6bt55woDvQt3z # p94+4zzv5G0nc+ncITCDho8sn5itdZWVOjf9n6VCOumMjF4nRSoMkJKYIvjNht6n # GtjMhYA70tzjkIi4bPyYkhFpMNlAqEDIp2TvPzp6klG5QoUErHIzdzoRTAtE4Dpb # 7240r6jarQX41TBXGOFq0NrxES1cm5zO/6159D24qZGHGm2hG4nDx+t2jw== # =ZKFy # -----END PGP SIGNATURE----- # gpg: Signature made Wed 29 Jan 2025 03:39:50 EST # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (49 commits) gitlab-ci: include full Rust backtraces in test runs rust: qemu-api: add sub-subclass to the integration tests rust/zeroable: Implement Zeroable with const_zero macro rust: qdev: make reset take a shared reference rust: pl011: drop use of ControlFlow rust: pl011: pull device-specific code out of MemoryRegionOps callbacks rust: pl011: remove duplicate definitions rust: pl011: wrap registers with BqlRefCell rust: pl011: extract PL011Registers rust: pl011: pull interrupt updates out of read/write ops rust: pl011: extract CharBackend receive logic into a separate function rust: pl011: extract conversion to RegisterOffset rust: pl011: hide unnecessarily 
"pub" items from outside pl011::device rust: pl011: remove unnecessary "extern crate" rust: prefer NonNull::new to assertions rust: vmstate: make order of parameters consistent in vmstate_clock rust: vmstate: remove translation of C vmstate macros rust: pl011: switch vmstate to new-style macros rust: qemu_api: add vmstate_struct rust: vmstate: add public utility macros to implement VMState ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-01-29 | Merge tag 'pull-target-arm-20250128-1' of ↵ | Stefan Hajnoczi | 49 | -337/+452
https://git.linaro.org/people/pmaydell/qemu-arm into staging target-arm queue: * hw/arm: Remove various uses of first_cpu global * hw/char/imx_serial: Fix reset value of UFCR register * hw/char/imx_serial: Update all state before restarting ageing timer * hw/pci-host/designware: Expose MSI IRQ * hw/arm/stellaris: refactoring, cleanup * hw/arm/stellaris: map both I2C controllers * tests/functional: Add a test for the arm microbit machine * target/arm: arm_reset_sve_state() should set FPSR, not FPCR * target/arm: refactorings preparatory to FEAT_AFP implementation * fpu: Rename float_flag_input_denormal to float_flag_input_denormal_flushed * fpu: Rename float_flag_output_denormal to float_flag_output_denormal_flushed * hw/usb/canokey: Fix buffer overflow for OUT packet # -----BEGIN PGP SIGNATURE----- # # iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmeZOi0ZHHBldGVyLm1h # eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3sUeEACwDhM4ldn/gVZgVN7nf42a # /CLD/qJx1vqi5bAB5zkY1bSCR9hS2IkhTBoQQH9Ng6ztG1IRpT/tKXDJAemWty70 # XgExdl4yjdwXMQK4JKU9qSfaBTuX7Z8Hz+nA1AnblO/4H+XpVNVJzp8Ee/uWTyEd # BKPBpwqbIXNwUWEqkzDok074Q05rHlhsJD2DsoJTcmtpROhLHLATwQDZGGFuf56H # LVcdx6GRP+/mWEGWLtj19mvaR/2cn4rQf+I1MACZ81nRjQCHbCohNAMr2wFsKg1+ # 2jYk9uHdFoambJ5+mFuC55Efk+QJaP4vDR0Gf3jLloFr+rS/5h3HiUuD8dUWOwFd # mPWXsjwYzqBW2knt1nfq1ByzYWZ8rVQEn5G53dX/eoNXuDGsonZxPnevgmv5kIUc # /W618Jez1nu9RDtNKccobHEtTGlGInJxJ7YzkU7Q6FO80IAqSdV7t9v7uPLJwcnz # nQz+wVzb4oOmwMzn3BpKY7N/S7IZOSy3ASNHj8o4yCHMJT8Ki0/N4bl0k0DLxJ0T # RiNCsV9c7MJfo9a+pbOnu0Lc3SjjropdvHYU+bB7R0mgd8ysN+Tou0dpa+i7tUTu # DHWqs2/+UApHKBiC+DSynPjjRR2aT/5lYFncGaiEVoEQttPLka3SAzgHPVQZs1zD # bxZkEAFktAFGIjU70fYNkg== # =H4p7 # -----END PGP SIGNATURE----- # gpg: Signature made Tue 28 Jan 2025 15:12:29 EST # gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE # gpg: issuer "peter.maydell@linaro.org" # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full] # gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full] # gpg: aka "Peter Maydell 
<pmaydell@chiark.greenend.org.uk>" [full] # gpg: aka "Peter Maydell <peter@archaic.org.uk>" [unknown] # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * tag 'pull-target-arm-20250128-1' of https://git.linaro.org/people/pmaydell/qemu-arm: (36 commits) hw/usb/canokey: Fix buffer overflow for OUT packet target/arm: Use FPST_A64_F16 for halfprec-to-other conversions target/arm: Remove redundant advsimd float16 helpers fpu: Fix a comment in softfloat-types.h fpu: Rename float_flag_output_denormal to float_flag_output_denormal_flushed fpu: Rename float_flag_input_denormal to float_flag_input_denormal_flushed target/arm: Remove now-unused vfp.fp_status_f16 and FPST_FPCR_F16 target/arm: Use FPST_A64_F16 in A64 decoder target/arm: Use FPST_A32_F16 in A32 decoder target/arm: Use fp_status_f16_a64 in AArch64-only helpers target/arm: Use fp_status_f16_a32 in AArch32-only helpers target/arm: Define new fp_status_f16_a32 and fp_status_f16_a64 target/arm: Remove now-unused vfp.fp_status and FPST_FPCR target/arm: Use FPST_A64 in A64 decoder target/arm: Use FPST_A32 in A32 decoder target/arm: Use fp_status_a32 in vfp_cmp helpers target/arm: Use fp_status_a32 in vjvct helper target/arm: Use fp_status_a64 or fp_status_a32 in is_ebf() target/arm: Use vfp.fp_status_a64 in A64-only helper functions target/arm: Define new fp_status_a32 and fp_status_a64 ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-01-29 | migration: cpr-transfer save and load | Steve Sistare | 4 | -0/+77
Add functions to create a QEMUFile based on a unix URI, for saving or loading, for use by cpr-transfer mode to preserve CPR state. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-16-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: VMSTATE_FD | Steve Sistare | 2 | -0/+32
Define VMSTATE_FD for declaring a file descriptor field in a VMStateDescription. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-15-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: SCM_RIGHTS for QEMUFile | Steve Sistare | 3 | -4/+84
Define functions to put/get file descriptors to/from a QEMUFile, for qio channels that support SCM_RIGHTS. Maintain ordering such that put(A), put(fd), put(B) followed by get(A), get(fd), get(B) always succeeds. Other get orderings may succeed but are not guaranteed. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-14-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: incoming channel | Steve Sistare | 3 | -8/+70
Extend the -incoming option to allow an @MigrationChannel to be specified. This allows channels other than 'main' to be described on the command line, which will be needed for CPR. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Acked-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-13-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: enhance migrate_uri_parse | Steve Sistare | 3 | -2/+18
Export migrate_uri_parse for use outside migration internals, and define a method migrate_is_uri that indicates when migrate_uri_parse should be used. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-12-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | hostmem-shm: preserve for cpr | Steve Sistare | 1 | -3/+9
Preserve memory-backend-shm memory objects during cpr-transfer. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-11-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | hostmem-memfd: preserve for cpr | Steve Sistare | 1 | -3/+9
Preserve memory-backend-memfd memory objects during cpr-transfer. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-10-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | physmem: preserve ram blocks for cpr | Steve Sistare | 1 | -5/+39
Save the memfd for ramblocks in CPR state, along with a name that uniquely identifies it. The block's idstr is not yet set, so it cannot be used for this purpose. Find the saved memfd in new QEMU when creating a block. If size of a resizable block is larger in new QEMU, extend it via the file_ram_alloc truncate parameter, and the extra space will be usable after a guest reset. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-9-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 | migration: cpr-state | Steve Sistare | 5 | -0/+232
CPR must save state that is needed after QEMU is restarted, when devices are realized. Thus the extra state cannot be saved in the migration channel, as objects must already exist before that channel can be loaded. Instead, define auxiliary state structures and vmstate descriptions, not associated with any registered object, and serialize the aux state to a cpr-specific channel in cpr_state_save. Deserialize in cpr_state_load after QEMU restarts, before devices are realized. Provide accessors for clients to register file descriptors for saving. The mechanism for passing the fd's to the new process will be specific to each migration mode, and added in subsequent patches. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-8-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
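A registry of named descriptors that exists independently of any device object can be modelled roughly as follows. This is a toy model, not the cpr-state code: the class `ToyCprState` and its methods are hypothetical, and the separate `fds` list stands in for descriptors that travel out of band and arrive with new numbers.

```python
class ToyCprState:
    """Toy registry of file descriptors keyed by (name, id); illustrative only."""

    def __init__(self):
        self._fds = {}

    def save_fd(self, name, id_, fd):
        # Clients register a descriptor before the state is serialized.
        self._fds[(name, id_)] = fd

    def find_fd(self, name, id_):
        # Returns -1 when nothing was saved under this key.
        return self._fds.get((name, id_), -1)

    def serialize(self):
        # Records describe the keys; descriptors are listed separately
        # because they are passed by a mode-specific mechanism.
        keys = sorted(self._fds)
        return keys, [self._fds[k] for k in keys]

    @classmethod
    def load(cls, records, received_fds):
        # Runs before any device exists: rebuild the registry with the
        # descriptor numbers valid in the new process.
        state = cls()
        for (name, id_), fd in zip(records, received_fds):
            state.save_fd(name, id_, fd)
        return state
```

Objects created later in the new process look up their descriptor by name instead of opening or allocating a fresh one.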
2025-01-29machine: aux-ram-share optionSteve Sistare4-0/+37
Allocate auxiliary guest RAM as an anonymous file that is shareable with an external process. This option applies to memory allocated as a side effect of creating various devices. It does not apply to memory-backend-objects, whether explicitly specified on the command line, or implicitly created by the -m command line option. This option is intended to support new migration modes, in which the memory region can be transferred in place to a new QEMU process, by sending the memfd file descriptor to the process. Memory contents are preserved, and if the mode also transfers device descriptors, then pages that are locked in memory for DMA remain locked. This behavior is a prerequisite for supporting vfio, vdpa, and iommufd devices with the new modes. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-7-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29memory: add RAM_PRIVATESteve Sistare6-7/+26
Define the RAM_PRIVATE flag. In RAMBlock creation functions, if MAP_SHARED is 0 in the flags parameter, in a subsequent patch the implementation may still create a shared mapping if other conditions require it. Callers who specifically want a private mapping, eg for objects specified by the user, must pass RAM_PRIVATE. After RAMBlock creation, MAP_SHARED in the block's flags indicates whether the block is shared or private, and MAP_PRIVATE is omitted. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-6-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29physmem: fd-based shared memorySteve Sistare3-4/+70
Create MAP_SHARED RAMBlocks by mmap'ing a file descriptor rather than using MAP_ANON, so the memory can be accessed in another process by passing and mmap'ing the fd. This will allow CPR to support memory-backend-ram and memory-backend-shm objects, provided the user creates them with share=on. Use memfd_create if available because it has no constraints. If not, use POSIX shm_open. However, allocation on the opened fd may fail if the shm mount size is too small, even if the system has free memory, so for backwards compatibility fall back to qemu_anon_ram_alloc/MAP_ANON on failure. For backwards compatibility on Windows, always use MAP_ANON. share=on has no purpose there, but the syntax is accepted, and must continue to work. Lastly, quietly fall back to MAP_ANON if the system does not support qemu_ram_alloc_from_fd. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-5-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
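The idea in this commit message, backing shared memory with a file descriptor instead of MAP_ANON, can be illustrated with a minimal sketch. It assumes Linux with glibc's memfd_create; the helper name alloc_shared_with_fd and the "demo-ram" label are hypothetical and are not QEMU's API, and the shm_open and MAP_ANON fallbacks described above are omitted.

```c
#define _GNU_SOURCE           /* needed for memfd_create() with glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not QEMU code): allocate `size` bytes of shared
 * memory backed by an anonymous memfd. On success, returns the mapping
 * and stores the fd in *fd_out; on failure, returns MAP_FAILED.
 */
static void *alloc_shared_with_fd(size_t size, int *fd_out)
{
    assert(fd_out != NULL);

    int fd = memfd_create("demo-ram", MFD_CLOEXEC);
    if (fd < 0) {
        return MAP_FAILED;
    }
    /* A new memfd is empty; give it the requested size. */
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return MAP_FAILED;
    }
    /* MAP_SHARED on the fd: any other mapping of the fd sees the same pages. */
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return MAP_FAILED;
    }
    *fd_out = fd;
    return p;
}
```

A second mmap of the same fd, whether in this process or in another process that received the fd, refers to the same pages, which is the property the commit relies on.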
2025-01-29physmem: qemu_ram_alloc_from_fd extensionsSteve Sistare3-21/+31
Extend qemu_ram_alloc_from_fd to support resizable ram, and define qemu_ram_resize_cb to clean up the API. Add a grow parameter to extend the file if necessary. However, if grow is false, a zero-sized file is always extended. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1736967650-129648-4-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29physmem: fix qemu_ram_alloc_from_fd size calculationSteve Sistare1-4/+6
qemu_ram_alloc_from_fd allocates space if file_size == 0. If non-zero, it uses the existing space and verifies it is large enough, but the verification was broken when the offset parameter was introduced. As a result, a file smaller than offset passes the verification and causes errors later. Fix that, and update the error message to include offset. Peter provides this concise reproducer: $ touch ramfile $ truncate -s 64M ramfile $ ./qemu-system-x86_64 -object memory-backend-file,mem-path=./ramfile,offset=128M,size=128M,id=mem1,prealloc=on qemu-system-x86_64: qemu_prealloc_mem: preallocating memory failed: Bad address With the fix, the error message is: qemu-system-x86_64: mem1 backing store size 0x4000000 is too small for 'size' option 0x8000000 plus 'offset' option 0x8000000 Cc: qemu-stable@nongnu.org Fixes: 4b870dc4d0c0 ("hostmem-file: add offset option") Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-3-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
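The condition being verified can be sketched as follows. This is a hypothetical stand-alone check, not the code from the patch: the function name backing_store_fits is an assumption, and the comparison is arranged to avoid overflow in offset + size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check (not the actual QEMU code): an existing backing
 * file is usable only if it covers the range [offset, offset + size).
 * Comparing against file_size - offset avoids overflowing offset + size.
 */
static bool backing_store_fits(uint64_t file_size, uint64_t offset,
                               uint64_t size)
{
    return offset <= file_size && size <= file_size - offset;
}
```

With the values from the reproducer (a 64M file, offset=128M, size=128M), this check rejects the file, matching the new error message.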
2025-01-29backends/hostmem-shm: factor out allocation of "anonymous shared memory with an fd"Steve Sistare5-43/+69
Let's factor it out so we can reuse it. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1736967650-129648-2-git-send-email-steven.sistare@oracle.com Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29migration: fix -Werror=maybe-uninitializedMarc-André Lureau1-1/+1
../migration/savevm.c: In function ‘qemu_savevm_state_complete_precopy_non_iterable’: ../migration/savevm.c:1560:20: error: ‘ret’ may be used uninitialized [-Werror=maybe-uninitialized] 1560 | return ret; | ^~~ Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20250114104811.2612846-1-marcandre.lureau@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29gitlab-ci: include full Rust backtraces in test runsPaolo Bonzini1-0/+1
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-28hw/usb/canokey: Fix buffer overflow for OUT packetHongren Zheng2-7/+3
When USBPacket in OUT direction has larger payload than the ep_out_buffer (of size 512), a buffer overflow would occur. It could be fixed by limiting the size of usb_packet_copy to be at most buffer size. Further optimization gets rid of the ep_out_buffer and directly uses ep_out as the target buffer. This is reported by a security researcher who artificially constructed an OUT packet of size 2047. The report has gone through the QEMU security process, and as this device is for testing purpose and no deployment of it in virtualization environment is observed, it is triaged not to be a security bug. Cc: qemu-stable@nongnu.org Fixes: d7d34918551dc48 ("hw/usb: Add CanoKey Implementation") Reported-by: Juan Jose Lopez Jaimez <thatjiaozi@gmail.com> Signed-off-by: Hongren Zheng <i@zenithal.me> Message-id: Z4TfMOrZz6IQYl_h@Sun Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
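The first fix described above, limiting the copy to the buffer size, follows a general pattern that can be sketched as below. The names EP_OUT_BUF_SIZE and copy_out_payload are hypothetical and memcpy stands in for usb_packet_copy; this is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 512 matches the buffer size quoted in the commit message. */
#define EP_OUT_BUF_SIZE 512

/*
 * Hypothetical helper (not the CanoKey code): copy at most
 * EP_OUT_BUF_SIZE bytes of an OUT payload into dst and return the
 * number of bytes actually copied. dst must hold EP_OUT_BUF_SIZE bytes.
 */
static size_t copy_out_payload(unsigned char *dst,
                               const unsigned char *payload,
                               size_t payload_len)
{
    assert(dst != NULL && payload != NULL);

    /* Clamp the length so an oversized packet cannot overrun dst. */
    size_t n = payload_len < EP_OUT_BUF_SIZE ? payload_len : EP_OUT_BUF_SIZE;
    memcpy(dst, payload, n);
    return n;
}
```

For a 2047-byte payload like the one in the report, only the first 512 bytes are copied and the rest is dropped.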
2025-01-28target/arm: Use FPST_A64_F16 for halfprec-to-other conversionsPeter Maydell2-5/+8
We should be using the F16-specific float_status for conversions from half-precision, because halfprec inputs never set Input Denormal. Without FEAT_AHP, using the wrong fpst here had no effect, because the only difference between the A64_F16 and A64 fpst is its handling of flush-to-zero on input and output, and the helper functions vfp_fcvt_f16_to_* and vfp_fcvt_*_to_f16 all explicitly squash the relevant flushing flags, and flush_inputs_to_zero was the only way that IDC could be set. With FEAT_AHP, the FPCR.AH=1 behaviour sets IDC for input_denormal_used, which we will only ignore in vfp_get_fpsr_from_host() for the A64_F16 fpst; so it matters that we use that one for f16 inputs (and the normal one for single/double to f16 conversions). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-27-peter.maydell@linaro.org
2025-01-28target/arm: Remove redundant advsimd float16 helpersPeter Maydell3-25/+8
The advsimd_addh etc helpers defined in helper-a64.c are identical to the vfp_addh etc helpers defined in helper-vfp.c: both take two float16 inputs (in a uint32_t type) plus a float_status* and are simple wrappers around the softfloat float16_* functions. (The duplication seems to be a historical accident: we added the advsimd helpers in 2018 as part of the A64 implementation, and at that time there was no f16 emulation in A32. Then later we added the A32 f16 handling by extending the existing VFP helper macros to generate f16 versions as well as f32 and f64, and didn't realise we could clean things up.) Remove the now-unnecessary advsimd helpers and make the places that generated calls to them use the vfp helpers instead. Many of the helper functions were already unused. (The remaining advsimd_ helpers are those which don't have vfp versions.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-26-peter.maydell@linaro.org
2025-01-28fpu: Fix a comment in softfloat-types.hPeter Maydell1-1/+1
In softfloat-types.h a comment documents that if the float_status field flush_to_zero is set then we flush denormalised results to 0 and set the inexact flag. This isn't correct: the status flag that we set when flush_to_zero causes us to flush an output to zero is float_flag_output_denormal_flushed. Correct the comment. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-22-peter.maydell@linaro.org
2025-01-28fpu: Rename float_flag_output_denormal to float_flag_output_denormal_flushedPeter Maydell9-11/+12
Our float_flag_output_denormal exception flag is set when the fpu code flushes an output denormal to zero. Rename it to float_flag_output_denormal_flushed: * this keeps it parallel with the flag for flushing input denormals, which we just renamed * it makes it clearer that it doesn't mean "set when the output is a denormal" Commit created with for f in `git grep -l float_flag_output_denormal`; do sed -i -e 's/float_flag_output_denormal/float_flag_output_denormal_flushed/' $f; done Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-21-peter.maydell@linaro.org
2025-01-28fpu: Rename float_flag_input_denormal to float_flag_input_denormal_flushedPeter Maydell8-18/+19
Our float_flag_input_denormal exception flag is set when the fpu code flushes an input denormal to zero. This is what many guest architectures (eg classic Arm behaviour) require, but it is not the only denormal-related reason we might want to set an exception flag. The x86 behaviour (which we do not currently model correctly) wants to see an exception flag when a denormal input is *not* flushed to zero and is actually used in an arithmetic operation. Arm's FEAT_AFP also wants these semantics. Rename float_flag_input_denormal to float_flag_input_denormal_flushed to make it clearer when it is set and to allow us to add a new float_flag_input_denormal_used next to it for the x86/FEAT_AFP semantics. Commit created with for f in `git grep -l float_flag_input_denormal`; do sed -i -e 's/float_flag_input_denormal/float_flag_input_denormal_flushed/' $f; done and manual editing of softfloat-types.h and softfloat.c to clean up the indentation afterwards and to fix a comment which wasn't using the full name of the flag. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-20-peter.maydell@linaro.org
2025-01-28target/arm: Remove now-unused vfp.fp_status_f16 and FPST_FPCR_F16Peter Maydell4-16/+0
Now we have moved all the uses of vfp.fp_status_f16 and FPST_FPCR_F16 to the new A32 or A64 fields, we can remove these. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250124162836.2332150-19-peter.maydell@linaro.org