Age | Commit message (Collapse) | Author | Files | Lines |
|
It converts the old if clauses into switch, explicitly mentions the
possible migration states. The old nested "if"s are not clear on what
we do on different states.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
Generalize the calculation part when migration complete into a
function to simplify migration_thread().
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
Introduce MigrationState.downtime_start to replace the local variable
"start_time" in migration_thread to avoid passing things around.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
Firstly, it was passed around. Let's just move it into MigrationState
just like many other variables as state of migration, renaming it to
vm_was_running.
One thing to mention is that for postcopy, we actually don't need this
knowledge at all since postcopy can't resume a VM even if it fails (we
can see that from the old code too: when we try to resume we also check
against "entered_postcopy" variable). So further we do this:
- in postcopy_start(), we don't update vm_old_running since useless
- in migration_thread(), we don't need to check entered_postcopy when
resume, since it's only used for precopy.
Comment this out too for that variable definition.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
It was used either to:
1. store initial timestamp of migration start, and
2. store total time used by last migration
Let's provide two parameters for each of them. Mix use of the two is
slightly misleading.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
It's only used once, clean it up a bit.
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
Moving existing callers all into migrate_fd_cleanup(). It simplifies
migration_thread() a bit.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
When reaching here if we are still "active" it means we must be in colo
state. After a quick discussion offlist, we decided to use the safer
error_report().
Finally I want to use "switch" here rather than lots of complicated if
clauses.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
current_migration has .instance_finalize callback, but it is not
called, because nobody unrefs current_migration. Fix that.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
Calling ram_bytes_remaining during the early part of setup is unsafe
because the ram_state isn't yet initialised.
This can happen in the sequence:
migrate
migrate_cancel
info migrate
if the migrate sticks trying to connect (e.g. to an unresponsive
destination due to the connect timeout). Here 'info migrate' sees
a state of CANCELLING and so assumes the migrate has partially happened.
partial fix for:
RH bz: https://bugzilla.redhat.com/show_bug.cgi?id=1525899
Reported-by: Xianxian Wang <xianwang@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
Postcopy total blocktime is available on destination side only.
But query-migrate was possible only for source. This patch
adds ability to call query-migrate on destination.
To be able to see postcopy blocktime, need to request postcopy-blocktime
capability.
The query-migrate command will show following sample result:
{"return":
"postcopy-vcpu-blocktime": [115, 100],
"status": "completed",
"postcopy-blocktime": 100
}}
postcopy_vcpu_blocktime contains list, where the first item is the first
vCPU in QEMU.
This patch has a drawback, it combines states of incoming and
outgoing migration. Ongoing migration state will overwrite incoming
state. Looks like better to separate query-migrate for incoming and
outgoing migration or add parameter to indicate type of migration.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
This patch just requests blocktime calculation,
and check it in case when UFFD_FEATURE_THREAD_ID feature is set
on the host.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
This patch provides blocktime calculation per vCPU,
as a summary and as a overlapped value for all vCPUs.
This approach was suggested by Peter Xu, as an improvements of
previous approch where QEMU kept tree with faulted page address and cpus bitmask
in it. Now QEMU is keeping array with faulted page address as value and vCPU
as index. It helps to find proper vCPU at UFFD_COPY time. Also it keeps
list for blocktime per vCPU (could be traced with page_fault_addr)
Blocktime will not calculated if postcopy_blocktime field of
MigrationIncomingState wasn't initialized.
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
This patch adds request to kernel space for UFFD_FEATURE_THREAD_ID, in
case this feature is provided by kernel.
PostcopyBlocktimeContext is encapsulated inside postcopy-ram.c,
due to it being a postcopy-only feature.
Also it defines PostcopyBlocktimeContext's instance live time.
Information from PostcopyBlocktimeContext instance will be provided
much after postcopy migration end, instance of PostcopyBlocktimeContext
will live till QEMU exit, but part of it (vcpu_addr,
page_fault_vcpu_time) used only during calculation, will be released
when postcopy ended or failed.
To enable postcopy blocktime calculation on destination, need to
request proper compatibility (Patch for documentation will be at the
tail of the patch set).
As an example following command enable that capability, assume QEMU was
started with
-chardev socket,id=charmonitor,path=/var/lib/migrate-vm-monitor.sock
option to control it
[root@host]#printf "{\"execute\" : \"qmp_capabilities\"}\r\n \
{\"execute\": \"migrate-set-capabilities\" , \"arguments\": {
\"capabilities\": [ { \"capability\": \"postcopy-blocktime\", \"state\":
true } ] } }" | nc -U /var/lib/migrate-vm-monitor.sock
Or just with HMP
(qemu) migrate_set_capability postcopy-blocktime on
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
Right now it could be used on destination side to
enable vCPU blocktime calculation for postcopy live migration.
vCPU blocktime - it's time since vCPU thread was put into
interruptible sleep, till memory page was copied and thread awake.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
Since commit 3a38429748 ("Add a "no HPT" encoding to HTAB migration stream")
the HTAB migration stream contains a header set to "-1", meaning there
is no HPT. Teach analyze-migration.py to ignore the section in this case.
Without this fix, the script fails with a dump from a POWER9 guest:
Traceback (most recent call last):
File "./qemu/scripts/analyze-migration.py", line 602, in <module>
dump.read(dump_memory = args.memory)
File "./qemu/scripts/analyze-migration.py", line 539, in read
section.read()
File "./qemu/scripts/analyze-migration.py", line 250, in read
self.file.readvar(n_valid * self.HASH_PTE_SIZE_64)
File "./qemu/scripts/analyze-migration.py", line 64, in readvar
raise Exception("Unexpected end of %s at 0x%x" % (self.filename, self.file.tell()))
Exception: Unexpected end of migrate.dump at 0x1d4763ba
Fixes: 3a38429748 ("Add a "no HPT" encoding to HTAB migration stream")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Peter Xu <peterx@redhat.com>
|
|
Mostly just manual conversion with very minor fixes.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
Otherwise, we can't use it after calling socket_start_incoming_migration
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
|
|
Once there, do one thing for line
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
|
|
We use int for everything (int64_t), and then we check that value is
between 0 and 255. Change it to the valid types.
This change only happens for HMP. QMP always use bytes and similar.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
staging
slirp updates
# gpg: Signature made Sun 14 Jan 2018 17:19:24 GMT
# gpg: using RSA key 0x996849C1CF560478
# gpg: Good signature from "Samuel Thibault <samuel.thibault@aquilenet.fr>"
# gpg: aka "Samuel Thibault <sthibault@debian.org>"
# gpg: aka "Samuel Thibault <samuel.thibault@gnu.org>"
# gpg: aka "Samuel Thibault <samuel.thibault@inria.fr>"
# gpg: aka "Samuel Thibault <samuel.thibault@labri.fr>"
# gpg: aka "Samuel Thibault <samuel.thibault@ens-lyon.org>"
# gpg: aka "Samuel Thibault <samuel.thibault@u-bordeaux.fr>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82 304B D017 8C76 7D06 9EE6
# Subkey fingerprint: 3A3A 5D46 4660 E867 610C A427 9968 49C1 CF56 0478
* remotes/thibault/tags/samuel-thibault:
slirp: add in6_dhcp_multicast()
slirp: removed unused code
slirp: remove unnecessary struct declaration
slirp: remove unused header
slirp: avoid IN6_IS_ADDR_UNSPECIFIED(), rather use in6_zero()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
|
|
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
|
|
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
|
|
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
|
|
Host: Mac OS 10.12.5
Compiler: Apple LLVM version 8.1.0 (clang-802.0.42)
slirp/ip6_icmp.c:80:38: warning: taking address of packed member 'ip_src' of class or
structure 'ip6' may result in an unaligned pointer value
[-Waddress-of-packed-member]
IN6_IS_ADDR_UNSPECIFIED(&ip->ip_src)) {
^~~~~~~~~~
/usr/include/netinet6/in6.h:238:42: note: expanded from macro 'IN6_IS_ADDR_UNSPECIFIED'
((*(const __uint32_t *)(const void *)(&(a)->s6_addr[0]) == 0) && \
^
Reported-by: John Arbuckle <programmingkidx@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
|
|
into staging
sdl2: bugfixes.
spice: cleanups.
input: mem leak fix.
gtk: deprecate 2.x support.
# gpg: Signature made Fri 12 Jan 2018 14:52:39 GMT
# gpg: using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/ui-20180112-pull-request:
sdl2: Ignore UI hotkeys after a focus change when GUI modifier is held
sdl2 uses surface relative coordinates
sdl2: Do not hide the cursor on auxilliary windows
spice: remove unused timer list
spice: remove only written event_mask field
spice: remove unused watch list
spice: remove QXLWorker interface field
ui: deprecate use of GTK 2.x in favour of 3.x series
input: fix memory leak
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
into staging
vnc: limit memory usage (CVE-2017-15124)
# gpg: Signature made Fri 12 Jan 2018 12:57:22 GMT
# gpg: using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/vnc-20180112-pull-request:
ui: mix misleading comments & return types of VNC I/O helper methods
ui: add trace events related to VNC client throttling
ui: place a hard cap on VNC server output buffer size
ui: fix VNC client throttling when forced update is requested
ui: fix VNC client throttling when audio capture is active
ui: refactor code for determining if an update should be sent to the client
ui: correctly reset framebuffer update state after processing dirty regions
ui: introduce enum to track VNC client framebuffer update request state
ui: track how much decoded data we consumed when doing SASL encoding
ui: avoid pointless VNC updates if framebuffer isn't dirty
ui: remove redundant indentation in vnc_client_update
ui: remove unreachable code in vnc_update_client
ui: remove 'sync' parameter from vnc_update_client
vnc: fix debug spelling
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
When SDL2 windows change focus while a key is held, the window that
receives the focus also receives a new KeyDown event, without an
autorepeat flag. This means that if a WM places the qemu console
over the main window after Ctrl-Alt-2, the console closes immediately
after opening. Then, the main window receives the KeyDown event again
and the whole process repeats.
This patch makes the SDL2 UI ignore the KeyDown events on a window that
just received the focus, if the GUI modifier was held. The ignore flag
is reset on a first KeyUp event. This effectively works around the issue
above.
Signed-off-by: Jindrich Makovicka <makovick@gmail.com>
Message-Id: <20171117112258.5888-4-makovick@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
This patch fixes mouse positioning with -device usb-tablet and fullscreen
or resized window.
Fixes: 46522a82236ea0cf9011b89896d2d8f8ddaf2443
Signed-off-by: Jindrich Makovicka <makovick@gmail.com>
Message-Id: <20171117112258.5888-3-makovick@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
Signed-off-by: Jindrich Makovicka <makovick@gmail.com>
Message-Id: <20171117112258.5888-2-makovick@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
Some older versions of gcc complain if a typedef is defined twice:
target/xtensa/translate.c:81: error: redefinition of typedef 'DisasContext'
target/xtensa/cpu.h:339: note: previous declaration of 'DisasContext' was here
Remove the now-redundant typedef from the definition of the struct in
translate.c.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1515762528-22818-1-git-send-email-peter.maydell@linaro.org
|
|
Signed-off-by: Frediano Ziglio <fziglio@redhat.com>
Message-id: 20171122135625.16625-4-fziglio@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
Signed-off-by: Frediano Ziglio <fziglio@redhat.com>
Message-id: 20171122135625.16625-3-fziglio@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
Signed-off-by: Frediano Ziglio <fziglio@redhat.com>
Message-id: 20171122135625.16625-2-fziglio@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
This fields points to an old interface that is no more
used in the current code.
Signed-off-by: Frediano Ziglio <fziglio@redhat.com>
Message-id: 20171122135625.16625-1-fziglio@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
The GTK 3.0 release was made in Feb, 2011:
https://blog.gtk.org/2011/02/10/gtk-3-0-released/
That will soon be 7 years ago, which is enough time to consider
the 3.x series widely supported.
Thus we deprecate the GTK 2.x support, which will allow us to
delete it in the last release of 2018. By this time, GTK 3.x
will be almost 8 years old.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20171212113440.16483-1-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
If kbd_queue is not empty and queue_count >= queue_limit,
we should free evt.
Change-Id: Ieeacf90d5e7e370a40452ec79031912d8b864d83
Signed-off-by: linzhecheng <linzhecheng@huawei.com>
Message-id: 20171225023730.5512-1-linzhecheng@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
While the QIOChannel APIs for reading/writing data return ssize_t, with negative
value indicating an error, the VNC code passes this return value through the
vnc_client_io_error() method. This detects the error condition, disconnects the
client and returns 0 to indicate error. Thus all the VNC helper methods should
return size_t (unsigned), and misleading comments which refer to the possibility
of negative return values need fixing.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-14-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
The VNC client throttling is quite subtle so will benefit from having trace
points available for live debugging.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-13-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
The previous patches fix problems with throttling of forced framebuffer updates
and audio data capture that would cause the QEMU output buffer size to grow
without bound. Those fixes are graceful in that once the client catches up with
reading data from the server, everything continues operating normally.
There is some data which the server sends to the client that is impractical to
throttle. Specifically there are various pseudo framebuffer update encodings to
inform the client of things like desktop resizes, pointer changes, audio
playback start/stop, LED state and so on. These generally only involve sending
a very small amount of data to the client, but a malicious guest might be able
to do things that trigger these changes at a very high rate. Throttling them is
not practical as missed or delayed events would cause broken behaviour for the
client.
This patch thus takes a more forceful approach of setting an absolute upper
bound on the amount of data we permit to be present in the output buffer at
any time. The previous patch set a threshold for throttling the output buffer
by allowing an amount of data equivalent to one complete framebuffer update and
one seconds worth of audio data. On top of this it allowed for one further
forced framebuffer update to be queued.
To be conservative, we thus take that throttling threshold and multiply it by
5 to form an absolute upper bound. If this bound is hit during vnc_write() we
forceably disconnect the client, refusing to queue further data. This limit is
high enough that it should never be hit unless a malicious client is trying to
exploit the sever, or the network is completely saturated preventing any sending
of data on the socket.
This completes the fix for CVE-2017-15124 started in the previous patches.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-12-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
The VNC server must throttle data sent to the client to prevent the 'output'
buffer size growing without bound, if the client stops reading data off the
socket (either maliciously or due to stalled/slow network connection).
The current throttling is very crude because it simply checks whether the
output buffer offset is zero. This check is disabled if the client has requested
a forced update, because we want to send these as soon as possible.
As a result, the VNC client can cause QEMU to allocate arbitrary amounts of RAM.
They can first start something in the guest that triggers lots of framebuffer
updates eg play a youtube video. Then repeatedly send full framebuffer update
requests, but never read data back from the server. This can easily make QEMU's
VNC server send buffer consume 100MB of RAM per second, until the OOM killer
starts reaping processes (hopefully the rogue QEMU process, but it might pick
others...).
To address this we make the throttling more intelligent, so we can throttle
full updates. When we get a forced update request, we keep track of exactly how
much data we put on the output buffer. We will not process a subsequent forced
update request until this data has been fully sent on the wire. We always allow
one forced update request to be in flight, regardless of what data is queued
for incremental updates or audio data. The slight complication is that we do
not initially know how much data an update will send, as this is done in the
background by the VNC job thread. So we must track the fact that the job thread
has an update pending, and not process any further updates until this job is
has been completed & put data on the output buffer.
This unbounded memory growth affects all VNC server configurations supported by
QEMU, with no workaround possible. The mitigating factor is that it can only be
triggered by a client that has authenticated with the VNC server, and who is
able to trigger a large quantity of framebuffer updates or audio samples from
the guest OS. Mostly they'll just succeed in getting the OOM killer to kill
their own QEMU process, but its possible other processes can get taken out as
collateral damage.
This is a more general variant of the similar unbounded memory usage flaw in
the websockets server, that was previously assigned CVE-2017-15268, and fixed
in 2.11 by:
commit a7b20a8efa28e5f22c26c06cd06c2f12bc863493
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Mon Oct 9 14:43:42 2017 +0100
io: monitor encoutput buffer size from websocket GSource
This new general memory usage flaw has been assigned CVE-2017-15124, and is
partially fixed by this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-11-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
The VNC server must throttle data sent to the client to prevent the 'output'
buffer size growing without bound, if the client stops reading data off the
socket (either maliciously or due to stalled/slow network connection).
The current throttling is very crude because it simply checks whether the
output buffer offset is zero. This check must be disabled if audio capture is
enabled, because when streaming audio the output buffer offset will rarely be
zero due to queued audio data, and so this would starve framebuffer updates.
As a result, the VNC client can cause QEMU to allocate arbitrary amounts of RAM.
They can first start something in the guest that triggers lots of framebuffer
updates eg play a youtube video. Then enable audio capture, and simply never
read data back from the server. This can easily make QEMU's VNC server send
buffer consume 100MB of RAM per second, until the OOM killer starts reaping
processes (hopefully the rogue QEMU process, but it might pick others...).
To address this we make the throttling more intelligent, so we can throttle
when audio capture is active too. To determine how to throttle incremental
updates or audio data, we calculate a size threshold. Normally the threshold is
the approximate number of bytes associated with a single complete framebuffer
update. ie width * height * bytes per pixel. We'll send incremental updates
until we hit this threshold, at which point we'll stop sending updates until
data has been written to the wire, causing the output buffer offset to fall
back below the threshold.
If audio capture is enabled, we increase the size of the threshold to also
allow for upto 1 seconds worth of audio data samples. ie nchannels * bytes
per sample * frequency. This allows the output buffer to have a mixture of
incremental framebuffer updates and audio data queued, but once the threshold
is exceeded, audio data will be dropped and incremental updates will be
throttled.
This unbounded memory growth affects all VNC server configurations supported by
QEMU, with no workaround possible. The mitigating factor is that it can only be
triggered by a client that has authenticated with the VNC server, and who is
able to trigger a large quantity of framebuffer updates or audio samples from
the guest OS. Mostly they'll just succeed in getting the OOM killer to kill
their own QEMU process, but its possible other processes can get taken out as
collateral damage.
This is a more general variant of the similar unbounded memory usage flaw in
the websockets server, that was previously assigned CVE-2017-15268, and fixed
in 2.11 by:
commit a7b20a8efa28e5f22c26c06cd06c2f12bc863493
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Mon Oct 9 14:43:42 2017 +0100
io: monitor encoutput buffer size from websocket GSource
This new general memory usage flaw has been assigned CVE-2017-15124, and is
partially fixed by this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-10-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
The logic for determining if it is possible to send an update to the client
will become more complicated shortly, so pull it out into a separate method
for easier extension later.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-9-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
According to the RFB protocol, a client sends one or more framebuffer update
requests to the server. The server can reply with a single framebuffer update
response, that covers all previously received requests. Once the client has
read this update from the server, it may send further framebuffer update
requests to monitor future changes. The client is free to delay sending the
framebuffer update request if it needs to throttle the amount of data it is
reading from the server.
The QEMU VNC server, however, has never correctly handled the framebuffer
update requests. Once QEMU has received an update request, it will continue to
send client updates forever, even if the client hasn't asked for further
updates. This prevents the client from throttling back data it gets from the
server. This change fixes the flawed logic such that after a set of updates are
sent out, QEMU waits for a further update request before sending more data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-8-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
Currently the VNC servers tracks whether a client has requested an incremental
or forced update with two boolean flags. There are only really 3 distinct
states to track, so create an enum to more accurately reflect permitted states.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-7-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
When we encode data for writing with SASL, we encode the entire pending output
buffer. The subsequent write, however, may not be able to send the full encoded
data in one go though, particularly with a slow network. So we delay setting the
output buffer offset back to zero until all the SASL encoded data is sent.
Between encoding the data and completing sending of the SASL encoded data,
however, more data might have been placed on the pending output buffer. So it
is not valid to set offset back to zero. Instead we must keep track of how much
data we consumed during encoding and subtract only that amount.
With the current bug we would be throwing away some pending data without having
sent it at all. By sheer luck this did not previously cause any serious problem
because appending data to the send buffer is always an atomic action, so we
only ever throw away complete RFB protocol messages. In the case of frame buffer
updates we'd catch up fairly quickly, so no obvious problem was visible.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-6-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
The vnc_update_client() method checks the 'has_dirty' flag to see if there are
dirty regions that are pending to send to the client. Regardless of this flag,
if a forced update is requested, updates must be sent. For unknown reasons
though, the code also tries to sent updates if audio capture is enabled. This
makes no sense as audio capture state does not impact framebuffer contents, so
this check is removed.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-5-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|