Age | Commit message (Collapse) | Author | Files | Lines |
|
The memory_region_tb_read tracepoint is unreachable, since notdirty
is supposed to apply only to writes. The memory_region_tb_write
tracepoint is mis-named, because notdirty is not only used for TB
invalidation. It is also used for e.g. VGA RAM updates and migration.
Replace memory_region_tb_write with memory_notdirty_write_access,
and place it in memory_notdirty_write_prepare where it can catch
all of the instances. Add memory_notdirty_set_dirty to log when
we no longer intercept writes to a page.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190709152053.16670-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[Rebased onto merge commit 95a9457fd44; missed instances of qom/cpu.h
in comments replaced]
|
|
Create a new monitor/ subdirectory and move monitor.c there. As the plan
is to move the monitor core into separate files, use the chance to
rename it to misc.c.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190613153405.24769-8-kwolf@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
|
|
Some trace points are attributed to the wrong source file. Happens
when we neglect to update trace-events for code motion, or add events
in the wrong place, or misspell the file name.
Clean up with help of cleanup-trace-events.pl. Same funnies as in the
previous commit, of course. Manually shorten its change to
linux-user/trace-events to */signal.c.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20190314180929.27722-6-armbru@redhat.com
Message-Id: <20190314180929.27722-6-armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Almost all trace-events point to docs/devel/tracing.txt in a comment
right at the beginning. Touch up the ones that don't.
[Updated with Markus' new commit description wording.
--Stefan]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190314180929.27722-2-armbru@redhat.com
Message-Id: <20190314180929.27722-2-armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Trace previous state, move tracepoint to runstate_set start (to cover
all cases for debugging), add string representations of traced states.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190124125154.474650-1-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Jobs are now expected to return their retcode on the stack, from the
.run callback, so we can remove that argument.
job_cancel does not need to set -ECANCELED because job_completed will
update the return code itself if the job was canceled.
While we're here, make job_completed static to job.c and remove it from
job.h; move the documentation of return code to the .run() callback and
to the job->ret property, accordingly.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20180830015734.19765-9-jsnow@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
This adds QMP commands that control the transition between states of the
job lifecycle.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
This moves the top-level job completion and cancellation functions from
BlockJob to Job.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
This moves BlockJob.status and the closely related functions
(block_)job_state_transition() and (block_)job_apply_verb to Job. The
two QAPI enums are renamed to JobStatus and JobVerb.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
virtio,vhost,pci,pc: features, cleanups
SRAT tables for DIMM devices
new virtio net flags for speed/duplex
post-copy migration support in vhost
cleanups in pci
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 20 Mar 2018 14:40:43 GMT
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (51 commits)
postcopy shared docs
libvhost-user: Claim support for postcopy
postcopy: Allow shared memory
vhost: Huge page align and merge
vhost+postcopy: Wire up POSTCOPY_END notify
vhost-user: Add VHOST_USER_POSTCOPY_END message
libvhost-user: mprotect & madvises for postcopy
vhost+postcopy: Call wakeups
vhost+postcopy: Add vhost waker
postcopy: postcopy_notify_shared_wake
postcopy: helper for waking shared
vhost+postcopy: Resolve client address
postcopy-ram: add a stub for postcopy_request_shared_page
vhost+postcopy: Helper to send requests to source for shared pages
vhost+postcopy: Stash RAMBlock and offset
vhost+postcopy: Send address back to qemu
libvhost-user+postcopy: Register new regions with the ufd
migration/ram: ramblock_recv_bitmap_test_byte_offset
postcopy+vhost-user: Split set_mem_table for postcopy
vhost+postcopy: Transmit 'listen' to slave
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# scripts/update-linux-headers.sh
|
|
The choice of call to discard a block is getting more complicated
for other cases. We use fallocate PUNCH_HOLE in any file cases;
it works for both hugepage and for tmpfs.
We use the DONTNEED for non-hugepage cases either where they're
anonymous or where they're private.
Care should be taken when trying other backing files.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Having "allow-oob":true for a command does not mean that this command
will always be run in out-of-band mode. The out-of-band quick path will
only be executed if we specify the extra "run-oob" flag when sending the
QMP request:
{ "execute": "command-that-allows-oob",
"arguments": { ... },
"control": { "run-oob": true } }
The "control" key is introduced to store this extra flag. "control"
field is used to store arguments that are shared by all the commands,
rather than command specific arguments. Let "run-oob" be the first.
Note that in the patch I exported qmp_dispatch_check_obj() to be used to
check the request earlier, and at the same time allowed "id" field to be
there since actually we always allow that.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-19-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase to qobject_to(), spelling fix]
Signed-off-by: Eric Blake <eblake@redhat.com>
|
|
This patches allows QMP monitors to be suspended/resumed.
One thing to mention is that for QMPs that are using IOThreads, we need
an explicit kick for the IOThread in case it is sleeping.
Meanwhile, we need to take special care on non-interactive HMPs.
Currently only gdbserver is using that. For these monitors, we still
don't allow suspend/resume operations.
Since at it, add traces for the operations.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-14-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
|
|
Any compound structs / unions / etc, should always be declared as
'void *' pointers, since it cannot be assumed that trace backends
are able to resolve QEMU typedefs.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20180308155524.5082-2-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
SystemTap's dtrace(1) produces the following warning when it encounters
"char const" instead of "const char":
Warning: /usr/bin/dtrace:trace-dtrace-root.dtrace:66: syntax error near:
probe flatview_destroy_rcu
Warning: Proceeding as if --no-pyparsing was given.
This is a limitation in current SystemTap releases. I have sent a patch
upstream to accept "char const" since it is valid C:
https://sourceware.org/ml/systemtap/2018-q1/msg00017.html
In QEMU we still wish to avoid warnings in the current SystemTap
release. It's simple enough to replace "char const" with "const char".
I'm not changing the documentation or implementing checks to prevent
this from occurring again in the future. The next release of SystemTap
will hopefully resolve this issue.
Cc: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20180201162625.4276-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Add some comments so I can understand the various nested loops.
Add some tracing so I can see what they're doing.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180105170138.23357-2-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Doug Gale <doug16k@gmail.com>
Message-id: 20171203013037.31978-1-doug16k@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In trace format '#' flag of printf is forbidden. Fix it to '0x%'.
This patch is created by the following:
check that we have a problem
> find . -name trace-events | xargs grep '%#' | wc -l
56
check that there are no cases with additional printf flags before '#'
> find . -name trace-events | xargs grep "%[-+ 0'I]+#" | wc -l
0
check that there are no wrong usage of '#' and '0x' together
> find . -name trace-events | xargs grep '0x%#' | wc -l
0
fix the problem
> find . -name trace-events | xargs sed -i 's/%#/0x%/g'
[Eric Blake noted that xargs grep '%[-+ 0'I]+#' should be xargs grep
"%[-+ 0'I]+#" instead so the shell quoting is correct.
--Stefan]
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170731160135.12101-3-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
The existing optimizations makes it feasible to have them available on all
builds.
Some quick'n'dirty numbers with 400.perlbench (SPECcpu2006) on the train input
(medium size - suns.pl) and the guest_mem_before event:
* vanilla, statically disabled
real 0m2,259s
user 0m2,252s
sys 0m0,004s
* vanilla, statically enabled (overhead: 2.18x)
real 0m4,921s
user 0m4,912s
sys 0m0,008s
* multi-tb, statically disabled (overhead: 0.99x) [within noise range]
real 0m2,228s
user 0m2,216s
sys 0m0,008s
* multi-tb, statically enabled (overhead: 0.99x) [within noise range]
real 0m2,229s
user 0m2,224s
sys 0m0,004s
Now enabling all events when booting an ARM system that immediately shuts down
(https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg04085.html):
* vanilla, statically disabled
real 0m32,153s
user 0m31,276s
sys 0m0,108s
* vanilla, statically enabled (overhead: 1.35x)
real 0m43,507s
user 0m42,680s
sys 0m0,168s
* multi-tb, statically disabled (overhead: 1.03x)
real 0m32,993s
user 0m32,516s
sys 0m0,104s
* multi-tb, statically enabled (overhead: 1.00x) [within noise range]
real 0m32,110s
user 0m31,176s
sys 0m0,156s
And finally enabling all events using Emilio's dbt-bench
(where orig == vanilla, new == multi-tb):
NBench score; higher is better
180 +-+--------+----------+----------+---------+----------+----------+----------+----------+----------+---------+----------+--------+-+
| |
| *** $$$$%% orig |
160 +-+....................................*.*.$..$.%............................................................orig-enabled +-+
| * * $ $ % new |
140 +-+....................................*.*.$..$.%............................................................new-disabled.......+-+
| * * $ $ % |
| * * $ $ % |
120 +-+....................................*.*.$..$.%...............................................................................+-+
| * * $ $ % |
| * * $ $ % |
100 +-+....................................*.*.$..$.%.....$$$%%%....................................................................+-+
| * * $ $ % *** $ $ % *** $$$%% |
80 +-+....................................*.*.$..$.%.*.*.$.$..%.*.*.$.$.%..........................................................+-+
| * * $ $ % * * $ $ % * * $ $ % |
| * * $ $ % * * $ $ % * * $ $ % |
60 +-+.........................***..$$$%%.*.*##..$.%.*.*.$.$..%.*.*.$.$.%..***.$$$%%...............................................+-+
| **** $$$%% * * $ $ % * * # $ % * *## $ % * * $ $ % * * $ $ % |
| * * $ $ % * * $ $ % * * # $ % * * # $ % * *## $ % * * $ $ % |
40 +-+..............*..*.$.$.%.*.*..$.$.%.*.*.#..$.%.*.*.#.$..%.*.*.#.$.%..*.*.$.$.%...............................................+-+
| * * $ $ % * * $ $ % * * # $ % * * # $ % * * # $ % * *## $ % *** $$$%%% |
20 +-+....***.$$$%%.*..*##.$.%.*.*###.$.%.*.*.#..$.%.*.*.#.$..%.*.*.#.$.%..*.*.#.$.%..................................*.*.$.$..%...+-+
| * *## $ % * * # $ % * * # $ % * * # $ % * * # $ % * * # $ % * * # $ % * *## $ % |
| * * # $ % * * # $ % * * # $ % * * # $ % * * # $ % * * # $ % * * # $ % ***###$$%% ***##$$$%% * * # $ % |
0 +-+----***##$$%%-****##$$%%-***###$$%%-***##$$$%%-***##$$%%%-***##$$%%--***##$$%%-****##$$%%-***###$$%%-***##$$$%%-***##$$%%%---+-+
NUMERIC SORTSTRING SORT BITFIEFP EMULATION ASSIGNMENT IDEA HUFFMAN FOURIER NEURLU DECOMPOSITION gmean
png: http://imgur.com/a/8XG5S
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-id: 149915849243.6295.4484103824675839071.stgit@frigg.lan
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
move kvm related accelerator files into accel/ subdirectory, also
create one stub subdirectory, which will include accelerator's stub
files.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <1496383606-18060-5-git-send-email-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
move tcg-runtime.c, translate-all.(ch) and translate-common.c into
accel/tcg/ subdirectory and updated related trace-events file.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <1496383606-18060-4-git-send-email-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
move cputlb.c, cpu-exec-common.c and cpu-exec.c related tcg exec
file into accel/tcg/ subdirectory.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <1496383606-18060-3-git-send-email-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Commit 104fc3027960dd2aa9d310936a6cb201c60e1088 ("qmp: Drop duplicated
QMP command object checks") removed the call to
trace_handle_qmp_command() while eliminating code duplication.
This patch brings the trace event back so QEMU-internal trace events can
be correlated with the QMP commands that caused them.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170605104216.22429-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
It is often useful to correlate QEMU-internal events with monitor
commands that caused them. Trace the full HMP command being executed.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170605104216.22429-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
It'll be nice to know which virq belongs to which device/vector when
adding msi routes, so adding two more parameters for the add trace.
Meanwhile, releasing virq has no tracing before. Add one for it.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494309644-18743-2-git-send-email-peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Suggested by Paolo Bonzini during series review.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
|
|
Time to wire up all the call sites that request a shutdown or
reset to use the enum added in the previous patch.
It would have been less churn to keep the common case with no
arguments as meaning guest-triggered, and only modified the
host-triggered code paths, via a wrapper function, but then we'd
still have to audit that I didn't miss any host-triggered spots;
changing the signature forces us to double-check that I correctly
categorized all callers.
Since command line options can change whether a guest reset request
causes an actual reset vs. a shutdown, it's easy to also add the
information to reset requests.
Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au> [ppc parts]
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> [SPARC part]
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> [s390x parts]
Message-Id: <20170515214114.15442-5-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
|
|
move xen-mapcache.c to hw/i386/xen/
Signed-off -by: Anthony Xu <anthony.xu@intel.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
|
|
move xen-hvm.c to hw/i386/xen/
Signed-off -by: Anthony Xu <anthony.xu@intel.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
|
|
qmp_check_input_obj() duplicates qmp_dispatch_check_obj(), except the
latter screws up an error message. handle_qmp_command() runs first
the former, then the latter via qmp_dispatch(), masking the screwup.
qemu-ga also masks the screwup, because it also duplicates checks,
just differently.
qmp_check_input_obj() exists because handle_qmp_command() needs to
examine the command before dispatching it. The previous commit got
rid of this need, except for a tracepoint, and a bit of "id" code that
relies on qdict not being null.
Fix up the error message in qmp_dispatch_check_obj(), drop
qmp_check_input_obj() and the tracepoint. Protect the "id" code with
a conditional.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-9-git-send-email-armbru@redhat.com>
|
|
AioContext is fairly self contained, the only dependency is QEMUTimer but
that in turn doesn't need anything else. So move them out of block-obj-y
to avoid introducing a dependency from io/ to block-obj-y.
main-loop and its dependency iohandler also need to be moved, because
later in this series io/ will call iohandler_get_aio_context.
[Changed copyright "the QEMU team" to "other QEMU contributors" as
suggested by Daniel Berrange and agreed by Paolo.
--Stefan]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213135235.12274-2-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
There are a number of unused trace events that
scripts/cleanup-trace-events.pl finds. The "hw/vfio/pci-quirks.c"
filename was typoed and "qapi/qapi-visit-core.c" was missing the qapi/
directory prefix.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170126171613.1399-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
The trace-events for a given source file should generally
always live in the same directory as the source file.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170125161417.31949-4-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Signals the hot-unplugging of a virtual (guest) CPU.
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 148278748597.1404.10546320797997984932.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20161212221759.28949-1-marcandre.lureau@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
This patch is based on the algorithm for the kvm.ko halt_poll_ns
parameter in Linux. The initial polling time is zero.
If the event loop is woken up within the maximum polling time it means
polling could be effective, so grow polling time.
If the event loop is woken up beyond the maximum polling time it means
polling is not effective, so shrink polling time.
If the event loop makes progress within the current polling time then
the sweet spot has been reached.
This algorithm adjusts the polling time so it can adapt to variations in
workloads. The goal is to reach the sweet spot while also recognizing
when polling would hurt more than help.
Two new trace events, poll_grow and poll_shrink, are added for observing
polling time adjustment.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-13-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
The AioContext event loop uses ppoll(2) or epoll_wait(2) to monitor file
descriptors or until a timer expires. In cases like virtqueues, Linux
AIO, and ThreadPool it is technically possible to wait for events via
polling (i.e. continuously checking for events without blocking).
Polling can be faster than blocking syscalls because file descriptors,
the process scheduler, and system calls are bypassed.
The main disadvantage to polling is that it increases CPU utilization.
In classic polling configuration a full host CPU thread might run at
100% to respond to events as quickly as possible. This patch implements
a timeout so we fall back to blocking syscalls if polling detects no
activity. After the timeout no CPU cycles are wasted on polling until
the next event loop iteration.
The run_poll_handlers_begin() and run_poll_handlers_end() trace events
are added to aid performance analysis and troubleshooting. If you need
to know whether polling mode is being used, trace these events to find
out.
Note that the AioContext is now re-acquired before disabling notify_me
in the non-polling case. This makes the code cleaner since notify_me
was enabled outside the non-polling AioContext release region. This
change is correct since it's safe to keep notify_me enabled longer
(disabling is an optimization) but potentially causes unnecessary
event_notifer_set() calls. I think the chance of performance regression
is small here.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
With a vfio assigned device we lay down a base MemoryRegion registered
as an IO region, giving us read & write accessors. If the region
supports mmap, we lay down a higher priority sub-region MemoryRegion
on top of the base layer initialized as a RAM device pointer to the
mmap. Finally, if we have any quirks for the device (ie. address
ranges that need additional virtualization support), we put another IO
sub-region on top of the mmap MemoryRegion. When this is flattened,
we now potentially have sub-page mmap MemoryRegions exposed which
cannot be directly mapped through KVM.
This is as expected, but a subtle detail of this is that we end up
with two different access mechanisms through QEMU. If we disable the
mmap MemoryRegion, we make use of the IO MemoryRegion and service
accesses using pread and pwrite to the vfio device file descriptor.
If the mmap MemoryRegion is enabled and results in one of these
sub-page gaps, QEMU handles the access as RAM, using memcpy to the
mmap. Using either pread/pwrite or the mmap directly should be
correct, but using memcpy causes us problems. I expect that not only
does memcpy not necessarily honor the original width and alignment in
performing a copy, but it potentially also uses processor instructions
not intended for MMIO spaces. It turns out that this has been a
problem for Realtek NIC assignment, which has such a quirk that
creates a sub-page mmap MemoryRegion access.
To resolve this, we disable memory_access_is_direct() for ram_device
regions since QEMU assumes that it can use memcpy for those regions.
Instead we access through MemoryRegionOps, which replaces the memcpy
with simple de-references of standard sizes to the host memory.
With this patch we attempt to provide unrestricted access to the RAM
device, allowing byte through qword access as well as unaligned
access. The assumption here is that accesses initiated by the VM are
driven by a device specific driver, which knows the device
capabilities. If unaligned accesses are not supported by the device,
we don't want them to work in a VM by performing multiple aligned
accesses to compose the unaligned access. A down-side of this
philosophy is that the xp command from the monitor attempts to use
the largest available access weidth, unaware of the underlying
device. Using memcpy had this same restriction, but at least now an
operator can dump individual registers, even if blocks of device
memory may result in access widths beyond the capabilities of a
given device (RTL NICs only support up to dword).
Reported-by: Thorsten Kohfeldt <thorsten.kohfeldt@gmx.de>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add missing execution mode documentation for the 'guest_cpu_enter' and
'guest_cpu_reset' events.
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 147566900921.7708.656450813307396468.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
The colo patch series added various trace events to the top
level trace-events file, despite the files using them being
in a sub-dir.
commit 30656b097e9dd7978d3fe9416cb9f5a421a9e63e
Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Date: Tue Sep 27 10:22:34 2016 +0800
filter-rewriter: rewrite tcp packet to keep secondary connection
commit f4b618360e5a81b097e2e35d52011bec3c63af68
Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Date: Tue Sep 27 10:22:31 2016 +0800
colo-compare: add TCP, UDP, ICMP packet comparison
We add TCP,UDP,ICMP packet comparison to replace
IP packet comparison. This can increase the
accuracy of the package comparison.
Less checkpoint more efficiency.
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
commit 0682e15b19b2f41c0568142b42518b9471168597
Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Date: Tue Sep 27 10:22:30 2016 +0800
colo-compare: introduce packet comparison thread
commit 59509ec16b7ee92b3f8261c554023aa1d3169317
Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Date: Tue Sep 27 10:22:27 2016 +0800
net/colo.c: add colo.c to define and handle packet
This moves all events into net/trace-events where they
were supposed to live.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1475588159-30598-2-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Explicitly state in which execution mode (user, softmmu, all) are guest
events available for tracing.
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 147456962135.11114.6146034359114598596.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Signals the reset of the state a virtual (guest) CPU.
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 147428971851.15111.8799439252178273840.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Signals the hot-plugging of a new virtual (guest) CPU.
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 147428971313.15111.18023030883528426840.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
The trace points for hw/virtio/virtio-balloon.c were mistakenly put
in the top level trace-events file, instead of util/trace-events in
commit 270ab88f7c1112389a02cee0e3e03b20fcc7547e
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Thu Jun 16 09:39:57 2016 +0100
trace: split out trace events for hw/virtio/ directory
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1473872624-23285-5-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
The trace points for util/qemu-coroutine*.c were mistakenly left
in the top level trace-events file, instead of util/trace-events
in
commit 492bb2dd651e780c0723580880acbedb5661e5ad
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Thu Jun 16 09:39:48 2016 +0100
trace: split out trace events for util/ directory
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1473872624-23285-3-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
We will rewrite tcp packet secondary received and sent.
When colo guest is a tcp server.
Firstly, client start a tcp handshake. the packet's seq=client_seq,
ack=0,flag=SYN. COLO primary guest get this pkt and mirror(filter-mirror)
to secondary guest, secondary get it use filter-redirector.
Then,primary guest response pkt
(seq=primary_seq,ack=client_seq+1,flag=ACK|SYN).
secondary guest response pkt
(seq=secondary_seq,ack=client_seq+1,flag=ACK|SYN).
In here,we use filter-rewriter save the secondary_seq to it's tcp connection.
Finally handshake,client send pkt
(seq=client_seq+1,ack=primary_seq+1,flag=ACK).
Here,filter-rewriter can get primary_seq, and rewrite ack from primary_seq+1
to secondary_seq+1, recalculate checksum. So the secondary tcp connection
kept good.
When we send/recv packet.
client send pkt(seq=client_seq+1+data_len,ack=primary_seq+1,flag=ACK|PSH).
filter-rewriter rewrite ack and send to secondary guest.
primary guest response pkt
(seq=primary_seq+1,ack=client_seq+1+data_len,flag=ACK)
secondary guest response pkt
(seq=secondary_seq+1,ack=client_seq+1+data_len,flag=ACK)
we rewrite secondary guest seq from secondary_seq+1 to primary_seq+1.
So tcp connection kept good.
In code We use offset( = secondary_seq - primary_seq )
to rewrite seq or ack.
handle_primary_tcp_pkt: tcp_pkt->th_ack += offset;
handle_secondary_tcp_pkt: tcp_pkt->th_seq -= offset;
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
|
|
We add TCP,UDP,ICMP packet comparison to replace
IP packet comparison. This can increase the
accuracy of the package comparison.
Less checkpoint more efficiency.
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
|
|
If primary packet is same with secondary packet,
we will send primary packet and drop secondary
packet, otherwise notify COLO frame to do checkpoint.
If primary packet comes but secondary packet does not,
after REGULAR_PACKET_CHECK_MS milliseconds we set
the primary packet as old_packet,then do a checkpoint.
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
|