path: root/migration/tls.c
author: Avihai Horon <avihaih@nvidia.com> 2024-03-04 12:53:37 +0200
committer: Peter Xu <peterx@redhat.com> 2024-03-11 14:41:40 -0400
commit: 4e1871c450a14e38b09d4e312922eefd475c1c64
tree: 0eda46f09de0f268640cb2859fae47bbdf423801 /migration/tls.c
parent: 7489f7f3f81dcb776df8c1b9a9db281fc21bf05f
migration: Don't serialize devices in qemu_savevm_state_iterate()
Commit 90697be8896c ("live migration: Serialize vmstate saving in stage
2") introduced device serialization in qemu_savevm_state_iterate(). The
rationale behind it was to first complete migration of slower changing
block devices and only then migrate the RAM, to avoid sending fast
changing RAM pages over and over.

This commit was added a long time ago, and while it was useful back
then, it is not the case anymore:
1. Block migration is deprecated, see commit 66db46ca83b8 ("migration:
   Deprecate block migration").
2. Today there are other iterative devices besides RAM and block, such
   as VFIO, which are registered for migration after RAM. With current
   serialization behavior, a fast changing device can block other
   devices from sending their data, which may prevent migration from
   converging in some cases.

The issue described in item 2 was observed in several VFIO migration
scenarios with switchover-ack capability enabled, where some workload on
the VM prevented RAM from ever reaching a hard zero, thus blocking VFIO
initial pre-copy data from being sent. Hence, destination could not ack
switchover and migration could not converge.

Fix that by not serializing iterative devices in
qemu_savevm_state_iterate().

Note that this still doesn't fully prevent device starvation. As
correctly pointed out by Peter [1], a fast changing device might
constantly consume all allocated bandwidth and block the following
devices. However, this scenario is more likely to happen only if
max-bandwidth is low.

[1] https://lore.kernel.org/qemu-devel/Zd6iw9dBhW6wKNxx@x1n/

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240304105339.20713-2-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
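
For illustration only, the starvation described above can be reproduced in a small toy model. This is a sketch under stated assumptions, not the QEMU code: ToyDevice, toy_iterate() and toy_run() are hypothetical names, and the pending sizes, dirty rate and per-pass budget are made-up numbers. The first device ("ram") is re-dirtied after every pass, so it never reports completion. The second ("vfio") only has a small amount of initial data to send.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy device; this is not a QEMU structure. */
typedef struct {
    const char *name;
    long pending;     /* bytes still to send */
    long dirty_rate;  /* bytes re-dirtied after each pass */
    long sent;        /* total bytes sent so far */
} ToyDevice;

/*
 * One iteration pass over all devices, in registration order. Each
 * device may send up to `budget` bytes per pass. With serialize set,
 * the pass stops at the first device that still has pending data, so
 * the devices registered after it are not visited at all.
 * Returns true only if every device finished the pass with no pending data.
 */
static bool toy_iterate(ToyDevice *devs, size_t n, long budget, bool serialize)
{
    bool all_done = true;

    for (size_t i = 0; i < n; i++) {
        ToyDevice *d = &devs[i];
        long chunk = d->pending < budget ? d->pending : budget;

        d->pending -= chunk;
        d->sent += chunk;
        /* The guest keeps dirtying data while we send. */
        d->pending += d->dirty_rate;

        if (d->pending > 0) {
            all_done = false;
            if (serialize) {
                break; /* serialized: later devices have to wait */
            }
        }
    }
    return all_done;
}

/* Run `rounds` passes and return the bytes sent by the second device. */
static long toy_run(bool serialize, int rounds)
{
    ToyDevice devs[] = {
        { "ram",  1000, 10, 0 },  /* never reaches zero pending */
        { "vfio",  100,  0, 0 },  /* small initial pre-copy data */
    };

    for (int i = 0; i < rounds; i++) {
        toy_iterate(devs, 2, 500, serialize);
    }
    return devs[1].sent;
}
```

In this model, toy_run(true, 50) leaves the second device with nothing sent, because the first device always has pending data at the end of its turn. toy_run(false, 50) lets the second device send all of its data in the first pass. The model does not cover the remaining starvation case from the note above, where a single device consumes the whole bandwidth budget.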
Diffstat (limited to 'migration/tls.c')
0 files changed, 0 insertions, 0 deletions