aboutsummaryrefslogtreecommitdiff
path: root/migration/ram.c
AgeCommit message (Collapse)AuthorFilesLines
2022-08-02Revert "migration: Simplify unqueue_page()"Thomas Huth1-11/+26
This reverts commit cfd66f30fb0f735df06ff4220e5000290a43dad3. The simplification of unqueue_page() introduced a bug that sometimes breaks migration on s390x hosts. The problem is not fully understood yet, but since we are already in the freeze for QEMU 7.1 and we need something working there, let's revert this patch for the upcoming release. The optimization can be redone later again in a proper way if necessary. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2099934 Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20220802061949.331576-1-thuth@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-07-20migration/multifd: Report to user when zerocopy not workingLeonardo Bras1-0/+5
Some errors, like the lack of Scatter-Gather support by the network interface(NETIF_F_SG) may cause sendmsg(...,MSG_ZEROCOPY) to fail on using zero-copy, which causes it to fall back to the default copying mechanism. After each full dirty-bitmap scan there should be a zero-copy flush happening, which checks for errors each of the previous calls to sendmsg(...,MSG_ZEROCOPY). If all of them failed to use zero-copy, then increment dirty_sync_missed_zero_copy migration stat to let the user know about it. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Message-Id: <20220711211112.18951-4-leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-07-20migration: Respect postcopy request order in preemption modePeter Xu1-13/+52
With preemption mode on, when we see a postcopy request that was requesting for exactly the page that we have preempted before (so we've partially sent the page already via PRECOPY channel and it got preempted by another postcopy request), currently we drop the request so that after all the other postcopy requests are serviced then we'll go back to precopy stream and start to handle that. We dropped the request because we can't send it via postcopy channel since the precopy channel already contains partial of the data, and we can only send a huge page via one channel as a whole. We can't split a huge page into two channels. That's a very corner case and that works, but there's a change on the order of postcopy requests that we handle since we're postponing this (unlucky) postcopy request to be later than the other queued postcopy requests. The problem is there's a possibility that when the guest was very busy, the postcopy queue can be always non-empty, it means this dropped request will never be handled until the end of postcopy migration. So, there's a chance that there's one dest QEMU vcpu thread waiting for a page fault for an extremely long time just because it's unluckily accessing the specific page that was preempted before. The worst case time it needs can be as long as the whole postcopy migration procedure. It's extremely unlikely to happen, but when it happens it's not good. The root cause of this problem is because we treat pss->postcopy_requested variable as with two meanings bound together, as the variable shows: 1. Whether this page request is urgent, and, 2. Which channel we should use for this page request. With the old code, when we set postcopy_requested it means either both (1) and (2) are true, or both (1) and (2) are false. We can never have (1) and (2) to have different values. However it doesn't necessarily need to be like that. It's very legal that there's one request that has (1) very high urgency, but (2) we'd like to use the precopy channel. Just like the corner case we were discussing above. To differenciate the two meanings better, introduce a new field called postcopy_target_channel, showing which channel we should use for this page request, so as to cover the old meaning (2) only. Then we leave the postcopy_requested variable to stand only for meaning (1), which is the urgency of this page request. With this change, we can easily boost priority of a preempted precopy page as long as we know that page is also requested as a postcopy page. So with the new approach in get_queued_page() instead of dropping that request, we send it right away with the precopy channel so we get back the ordering of the page faults just like how they're requested on dest. Reported-by: Manish Mishra <manish.mishra@nutanix.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Manish Mishra <manish.mishra@nutanix.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185520.27583-1-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-07-20migration: Add property x-postcopy-preempt-break-hugePeter Xu1-0/+7
Add a property field that can conditionally disable the "break sending huge page" behavior in postcopy preemption. By default it's enabled. It should only be used for debugging purposes, and we should never remove the "x-" prefix. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Manish Mishra <manish.mishra@nutanix.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185511.27366-1-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-07-20migration: Postcopy preemption enablementPeter Xu1-8/+243
This patch enables postcopy-preempt feature. It contains two major changes to the migration logic: (1) Postcopy requests are now sent via a different socket from precopy background migration stream, so as to be isolated from very high page request delays. (2) For huge page enabled hosts: when there's postcopy requests, they can now intercept a partial sending of huge host pages on src QEMU. After this patch, we'll live migrate a VM with two channels for postcopy: (1) PRECOPY channel, which is the default channel that transfers background pages; and (2) POSTCOPY channel, which only transfers requested pages. There's no strict rule of which channel to use, e.g., if a requested page is already being transferred on precopy channel, then we will keep using the same precopy channel to transfer the page even if it's explicitly requested. In 99% of the cases we'll prioritize the channels so we send requested page via the postcopy channel as long as possible. On the source QEMU, when we found a postcopy request, we'll interrupt the PRECOPY channel sending process and quickly switch to the POSTCOPY channel. After we serviced all the high priority postcopy pages, we'll switch back to PRECOPY channel so that we'll continue to send the interrupted huge page again. There's no new thread introduced on src QEMU. On the destination QEMU, one new thread is introduced to receive page data from the postcopy specific socket (done in the preparation patch). This patch has a side effect: after sending postcopy pages, previously we'll assume the guest will access follow up pages so we'll keep sending from there. Now it's changed. Instead of going on with a postcopy requested page, we'll go back and continue sending the precopy huge page (which can be intercepted by a postcopy request so the huge page can be sent partially before). Whether that's a problem is debatable, because "assuming the guest will continue to access the next page" may not really suite when huge pages are used, especially if the huge page is large (e.g. 1GB pages). So that locality hint is much meaningless if huge pages are used. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185504.27203-1-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-07-20migration: Postcopy preemption preparation on channel creationPeter Xu1-7/+18
Create a new socket for postcopy to be prepared to send postcopy requested pages via this specific channel, so as to not get blocked by precopy pages. A new thread is also created on dest qemu to receive data from this new channel based on the ram_load_postcopy() routine. The ram_load_postcopy(POSTCOPY) branch and the thread has not started to function, and that'll be done in follow up patches. Cleanup the new sockets on both src/dst QEMUs, meanwhile look after the new thread too to make sure it'll be recycled properly. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185502.27149-1-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: With Peter's fix to quieten compiler warning on start_migration
2022-06-23migration: remove the QEMUFileOps abstractionDaniel P. Berrangé1-2/+1
Now that all QEMUFile callbacks are removed, the entire concept can be deleted. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-06-22migration: rename qemu_update_position to qemu_file_credit_transferDaniel P. Berrangé1-1/+1
The qemu_update_position method name gives the misleading impression that it is changing the current file offset. Most of the files are just streams, however, so there's no concept of a file offset in the general case. What this method is actually used for is to report on the number of bytes that have been transferred out of band from the main I/O methods. This new name better reflects this purpose. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-06-22migration: switch to use QIOChannelNull for dummy channelDaniel P. Berrangé1-3/+4
This removes one further custom impl of QEMUFile, in favour of a QIOChannel based impl. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-05-16multifd: multifd_send_sync_main now returns negative on errorLeonardo Bras1-7/+22
Even though multifd_send_sync_main() currently emits error_reports, it's callers don't really check it before continuing. Change multifd_send_sync_main() to return -1 on error and 0 on success. Also change all it's callers to make use of this change and possibly fail earlier. (This change is important to next patch on multifd zero copy implementation, to make it sure an error in zero-copy flush does not go unnoticed. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20220513062836.965425-7-leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-04-21migration: Fix operator typeDr. David Alan Gilbert1-1/+1
Clang spotted an & that should have been an &&; fix it. Reported by: David Binderman / https://gitlab.com/dcb Fixes: 65dacaa04fa ("migration: introduce save_normal_page()") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/963 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220406102515.96320-1-dgilbert@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-04-21migration: Export ram_load_postcopy()Peter Xu1-1/+1
Will be reused in postcopy fast load thread. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220331150857.74406-6-peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-04-21migration: Add pss.postcopy_requested statusPeter Xu1-0/+6
This boolean flag shows whether the current page during migration is triggered by postcopy or not. Then in ram_save_host_page() and deeper stack we'll be able to have a reference on the priority of this page. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220331150857.74406-4-peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-03-21Use g_new() & friends where that makes obvious senseMarkus Armbruster1-1/+1
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer, for two reasons. One, it catches multiplication overflowing size_t. Two, it returns T * rather than void *, which lets the compiler catch more type errors. This commit only touches allocations with size arguments of the form sizeof(T). Patch created mechanically with: $ spatch --in-place --sp-file scripts/coccinelle/use-g_new-etc.cocci \ --macro-file scripts/cocci-macro-file.h FILES... Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220315144156.1595462-4-armbru@redhat.com> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
2022-03-02migration: Move static var in ram_block_from_stream() into globalPeter Xu1-4/+9
Static variable is very unfriendly to threading of ram_block_from_stream(). Move it into MigrationIncomingState. Make the incoming state pointer to be passed over to ram_block_from_stream() on both caller sites. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220301083925.33483-8-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-03-02migration: Dump ramblock and offset too when non-same-page detectedPeter Xu1-2/+6
In ram_load_postcopy() we'll try to detect non-same-page case and dump error. This error is very helpful for debugging. Adding ramblock & offset into the error log too. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220301083925.33483-6-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Fix up long line
2022-03-02migration: Introduce postcopy channels on dest nodePeter Xu1-24/+21
Postcopy handles huge pages in a special way that currently we can only have one "channel" to transfer the page. It's because when we install pages using UFFDIO_COPY, we need to have the whole huge page ready, it also means we need to have a temp huge page when trying to receive the whole content of the page. Currently all maintainance around this tmp page is global: firstly we'll allocate a temp huge page, then we maintain its status mostly within ram_load_postcopy(). To enable multiple channels for postcopy, the first thing we need to do is to prepare N temp huge pages as caching, one for each channel. Meanwhile we need to maintain the tmp huge page status per-channel too. To give some example, some local variables maintained in ram_load_postcopy() are listed; they are responsible for maintaining temp huge page status: - all_zero: this keeps whether this huge page contains all zeros - target_pages: this counts how many target pages have been copied - host_page: this keeps the host ptr for the page to install Move all these fields to be together with the temp huge pages to form a new structure called PostcopyTmpPage. Then for each (future) postcopy channel, we need one structure to keep the state around. For vanilla postcopy, obviously there's only one channel. It contains both precopy and postcopy pages. This patch teaches the dest migration node to start realize the possible number of postcopy channels by introducing the "postcopy_channels" variable. Its value is calculated when setup postcopy on dest node (during POSTCOPY_LISTEN phase). Vanilla postcopy will have channels=1, but when postcopy-preempt capability is enabled (in the future), we will boost it to 2 because even during partial sending of a precopy huge page we still want to preempt it and start sending the postcopy requested page right away (so we start to keep two temp huge pages; more if we want to enable multifd). In this patch there's a TODO marked for that; so far the channels is always set to 1. We need to send one "host huge page" on one channel only and we cannot split them, because otherwise the data upon the same huge page can locate on more than one channel so we need more complicated logic to manage. One temp host huge page for each channel will be enough for us for now. Postcopy will still always use the index=0 huge page even after this patch. However it prepares for the latter patches where it can start to use multiple channels (which needs src intervention, because only src knows which channel we should use). Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220301083925.33483-5-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Fixed up long line
2022-02-21include: Move qemu_madvise() and related #defines to new qemu/madvise.hPeter Maydell1-0/+1
The function qemu_madvise() and the QEMU_MADV_* constants associated with it are used in only 10 files. Move them out of osdep.h to a new qemu/madvise.h header that is included where it is needed. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220208200856.3558249-2-peter.maydell@linaro.org
2022-01-28migration: Simplify unqueue_page()Peter Xu1-26/+11
This patch simplifies unqueue_page() on both sides of it (itself, and caller). Firstly, due to the fact that right after unqueue_page() returned true, we'll definitely send a huge page (see ram_save_huge_page() call - it will _never_ exit before finish sending that huge page), so unqueue_page() does not need to jump in small page size if huge page is enabled on the ramblock. IOW, it's destined that only the 1st 4K page will be valid, when unqueue the 2nd+ time we'll notice the whole huge page has already been sent anyway. Switching to operating on huge page reduces a lot of the loops of redundant unqueue_page(). Meanwhile, drop the dirty check. It's not helpful to call test_bit() every time to jump over clean pages, as ram_save_host_page() has already done so, while in a faster way (see commit ba1b7c812c ("migration/ram: Optimize ram_save_host_page()", 2021-05-13)). So that's not necessary too. Drop the two tracepoints along the way - based on above analysis it's very possible that no one is really using it.. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28migration: Add postcopy_has_request()Peter Xu1-17/+28
Add a helper to detect whether postcopy has pending request. Since at it, cleanup the code a bit, e.g. in unqueue_page() we shouldn't need to check it again on queue empty because we're the only one (besides cleanup code, which should never run during this process) that will take a request off the list, so the request list can only grow but not shrink under the hood. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28migration: No off-by-one for pss->page update in host page sizePeter Xu1-2/+2
We used to do off-by-one fixup for pss->page when finished one host huge page transfer. That seems to be unnecesary at all. Drop it. Cc: Keqian Zhu <zhukeqian1@huawei.com> Cc: Kunkun Jiang <jiangkunkun@huawei.com> Cc: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28migration: Tally pre-copy, downtime and post-copy bytes independentlyDavid Edmondson1-0/+7
Provide information on the number of bytes copied in the pre-copy, downtime and post-copy phases of migration. Signed-off-by: David Edmondson <david.edmondson@oracle.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28migration: Introduce ram_transferred_add()David Edmondson1-9/+14
Replace direct manipulation of ram_counters.transferred with a function. Signed-off-by: David Edmondson <david.edmondson@oracle.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28migration: Don't return for postcopy_send_discard_bm_ram()Philippe Mathieu-Daudé1-5/+1
postcopy_send_discard_bm_ram() always return zero. Since it can't fail, simplify and do not return anything. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: David Edmondson <david.edmondson@oracle.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28migration: Drop return code for disgard ram processPeter Xu1-15/+5
It will just never fail. Drop those return values where they're constantly zeros. A tiny touch-up on the tracepoint so trace_ram_postcopy_send_discard_bitmap() is called after the logic itself (which sounds more reasonable). Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28migration: Do chunk page in postcopy_each_ram_send_discard()Peter Xu1-10/+10
Right now we loop ramblocks for twice, the 1st time chunk the dirty bits with huge page information; the 2nd time we send the discard ranges. That's not necessary - we can do them in a single loop. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28migration: Drop postcopy_chunk_hostpages()Peter Xu1-26/+7
This function calls three functions: - postcopy_discard_send_init(ms, block->idstr); - postcopy_chunk_hostpages_pass(ms, block); - postcopy_discard_send_finish(ms); However only the 2nd function call is meaningful. It's major role is to make sure dirty bits are applied in host-page-size granule, so there will be no partial dirty bits set for a whole host page if huge pages are used. The 1st/3rd call are for latter when we want to send the disgard ranges. They're mostly no-op here besides some tracepoints (which are misleading!). Drop them, then we can directly drop postcopy_chunk_hostpages() as a whole because we can call postcopy_chunk_hostpages_pass() directly. There're still some nice comments above postcopy_chunk_hostpages() that explain what it does. Copy it over to the caller's site. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28migration: Don't return for postcopy_chunk_hostpages()Peter Xu1-9/+2
It always return zero, because it just can't go wrong so far. Simplify the code with no functional change. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28migration: Drop dead code of ram_debug_dump_bitmap()Peter Xu1-39/+0
I planned to add "#ifdef DEBUG_POSTCOPY" around the function too because otherwise it'll be compiled into qemu binary even if it'll never be used. Then I found that maybe it's easier to just drop it for good.. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-01-28migration/ram: clean up unused comment.Xu Zheng1-1/+0
Just a removal of an unused comment. a0a8aa147aa did many fixes and removed the parameter named "ms", but forget to remove the corresponding comment in function named "ram_save_host_page". Signed-off-by: Xu Zheng <xuzheng@cmss.chinamobile.com> Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>
2022-01-28migration: Move ram_release_pages() call to save_zero_page_to_file()Juan Quintela1-11/+10
We always need to call it when we find a zero page, so put it in a single place. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2022-01-28migration: simplify do_compress_ram_pageJuan Quintela1-8/+3
The goto is not needed at all. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-01-28migration: Remove masking for compressionJuan Quintela1-2/+2
Remove the mask in the call to ram_release_pages(). Nothing else does it, and if the offset has that bits set, we have a lot of trouble. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-01-28migration: ram_release_pages() always receive 1 page as argumentJuan Quintela1-4/+4
Remove the pages argument. And s/pages/page/ Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> --- - Use 1LL instead of casts (philmd) - Change the whole 1ULL for TARGET_PAGE_SIZE
2022-01-28migration: We only need last_stage in two placesJuan Quintela1-23/+18
We only need last_stage in two places and we are passing it all around. Just add a field to RAMState that passes it. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> --- Repeat subject (philmd suggestion)
2021-12-15migration: Remove is_zero_range()Juan Quintela1-7/+2
It just calls buffer_is_zero(). Just change the callers. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2021-12-15migration/ram.c: Remove the qemu_mutex_lock in colo_flush_ram_cache.Rao, Lei1-2/+0
The code to acquire bitmap_mutex is added in the commit of "63268c4970a5f126cc9af75f3ccb8057abef5ec0". There is no need to acquire bitmap_mutex in colo_flush_ram_cache(). This is because the colo_flush_ram_cache only be called on the COLO secondary VM, which is the destination side. On the COLO secondary VM, only the COLO thread will touch the bitmap of ram cache. Signed-off-by: Lei Rao <lei.rao@intel.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-09Reset the auto-converge counter at every checkpoint.Rao, Lei1-0/+9
if we don't reset the auto-converge counter, it will continue to run with COLO running, and eventually the system will hang due to the CPU throttle reaching DEFAULT_MIGRATE_MAX_CPU_THROTTLE. Signed-off-by: Lei Rao <lei.rao@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Tested-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-09Reduce the PVM stop time during CheckpointRao, Lei1-3/+45
When flushing memory from ram cache to ram during every checkpoint on secondary VM, we can copy continuous chunks of memory instead of 4096 bytes per time to reduce the time of VM stop during checkpoint. Signed-off-by: Lei Rao <lei.rao@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Tested-by: Lukas Straub <lukasstraub2@web.de> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-03colo: Don't dump colo cache if dump-guest-core=offLukas Straub1-0/+6
One might set dump-guest-core=off to make coredumps smaller and still allow to debug many qemu bugs. Extend this option to the colo cache. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-03migration: provide an error message to migration_cancel()Laurent Vivier1-2/+1
This avoids to call migrate_get_current() in the caller function whereas migration_cancel() already needs the pointer to the current migration state. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01migration/ram: Handle RAMBlocks with a RamDiscardManager on background snapshotsDavid Hildenbrand1-2/+36
We already don't ever migrate memory that corresponds to discarded ranges as managed by a RamDiscardManager responsible for the mapped memory region of the RAMBlock. virtio-mem uses this mechanism to logically unplug parts of a RAMBlock. Right now, we still populate zeropages for the whole usable part of the RAMBlock, which is undesired because: 1. Even populating the shared zeropage will result in memory getting consumed for page tables. 2. Memory backends without a shared zeropage (like hugetlbfs and shmem) will populate an actual, fresh page, resulting in an unintended memory consumption. Discarded ("logically unplugged") parts have to remain discarded. As these pages are never part of the migration stream, there is no need to track modifications via userfaultfd WP reliably for these parts. Further, any writes to these ranges by the VM are invalid and the behavior is undefined. Note that Linux only supports userfaultfd WP on private anonymous memory for now, which usually results in the shared zeropage getting populated. The issue will become more relevant once userfaultfd WP supports shmem and hugetlb. Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01migration/ram: Factor out populating pages readable in ↵David Hildenbrand1-13/+22
ram_block_populate_pages() Let's factor out prefaulting/populating to make further changes easier to review and add a comment what we are actually expecting to happen. While at it, use the actual page size of the ramblock, which defaults to qemu_real_host_page_size for anonymous memory. Further, rename ram_block_populate_pages() to ram_block_populate_read() as well, to make it clearer what we are doing. In the future, we might want to use MADV_POPULATE_READ to speed up population. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01migration: Simplify alignment and alignment checksDavid Hildenbrand1-1/+1
Let's use QEMU_ALIGN_DOWN() and friends to make the code a bit easier to read. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destinationDavid Hildenbrand1-0/+21
Currently, when someone (i.e., the VM) accesses discarded parts inside a RAMBlock with a RamDiscardManager managing the corresponding mapped memory region, postcopy will request migration of the corresponding page from the source. The source, however, will never answer, because it refuses to migrate such pages with undefined content ("logically unplugged"): the pages are never dirty, and get_queued_page() will consequently skip processing these postcopy requests. Especially reading discarded ("logically unplugged") ranges is supposed to work in some setups (for example with current virtio-mem), although it barely ever happens: still, not placing a page would currently stall the VM, as it cannot make forward progress. Let's check the state via the RamDiscardManager (the state e.g., of virtio-mem is migrated during precopy) and avoid sending a request that will never get answered. Place a fresh zero page instead to keep the VM working. This is the same behavior that would happen automatically without userfaultfd being active, when accessing virtual memory regions without populated pages -- "populate on demand". For now, there are valid cases (as documented in the virtio-mem spec) where a VM might read discarded memory; in the future, we will disallow that. Then, we might want to handle that case differently, e.g., warning the user that the VM seems to be mis-behaving. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01migration/ram: Handle RAMBlocks with a RamDiscardManager on the migration sourceDavid Hildenbrand1-0/+77
We don't want to migrate memory that corresponds to discarded ranges as managed by a RamDiscardManager responsible for the mapped memory region of the RAMBlock. The content of these pages is essentially stale and without any guarantees for the VM ("logically unplugged"). Depending on the underlying memory type, even reading memory might populate memory on the source, resulting in an undesired memory consumption. Of course, on the destination, even writing a zeropage consumes memory, which we also want to avoid (similar to free page hinting). Currently, virtio-mem tries achieving that goal (not migrating "unplugged" memory that was discarded) by going via qemu_guest_free_page_hint() - but it's hackish and incomplete. For example, background snapshots still end up reading all memory, as they don't do bitmap syncs. Postcopy recovery code will re-add previously cleared bits to the dirty bitmap and migrate them. Let's consult the RamDiscardManager after setting up our dirty bitmap initially and when postcopy recovery code reinitializes it: clear corresponding bits in the dirty bitmaps (e.g., of the RAMBlock and inside KVM). It's important to fixup the dirty bitmap *after* our initial bitmap sync, such that the corresponding dirty bits in KVM are actually cleared. As colo is incompatible with discarding of RAM and inhibits it, we don't have to bother. Note: if a misbehaving guest would use discarded ranges after migration started we would still migrate that memory: however, then we already populated that memory on the migration source. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01memory: make global_dirty_tracking a bitmaskHyman Huang(黄勇)1-4/+11
since dirty ring has been introduced, there are two methods to track dirty pages of vm. it seems that "logging" has a hint on the method, so rename the global_dirty_log to global_dirty_tracking would make description more accurate. dirty rate measurement may start or stop dirty tracking during calculation. this conflict with migration because stop dirty tracking make migration leave dirty pages out then that'll be a problem. make global_dirty_tracking a bitmask can let both migration and dirty rate measurement work fine. introduce GLOBAL_DIRTY_MIGRATION and GLOBAL_DIRTY_DIRTY_RATE to distinguish what current dirty tracking aims for, migration or dirty rate. Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn> Message-Id: <9c9388657cfa0301bd2c1cfa36e7cf6da4aeca19.1624040308.git.huangy81@chinatelecom.cn> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-10-19migration/ram: Don't passs RAMState to ↵David Hildenbrand1-8/+5
migration_clear_memory_region_dirty_bitmap_*() The parameter is unused, let's drop it. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-07-26migration: clear the memory region dirty bitmap when skipping free pagesWei Wang1-18/+56
When skipping free pages to send, their corresponding dirty bits in the memory region dirty bitmap need to be cleared. Otherwise the skipped pages will be sent in the next round after the migration thread syncs dirty bits from the memory region dirty bitmap. Cc: David Hildenbrand <david@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Reported-by: David Hildenbrand <david@redhat.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Message-Id: <20210722083055.23352-1-wei.w.wang@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-26migration: Teach QEMUFile to be QIOChannel-awarePeter Xu1-1/+1
migration uses QIOChannel typed qemufiles. In follow up patches, we'll need the capability to identify this fact, so that we can get the backing QIOChannel from a QEMUFile. We can also define types for QEMUFile but so far since we only need to be able to identify QIOChannel, introduce a boolean which is simpler. Introduce another helper qemu_file_get_ioc() to return the ioc backend of a qemufile if has_ioc is set. No functional change. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210722175841.938739-5-peterx@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>