|
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Commit 076003f5 added configuration for NFS with IMGOPTSSYNTAX enabled,
but it didn't use the right variable name: $TEST_DIR_OPTS doesn't exist.
This fixes the mistake.
However, this alone doesn't make anything work that was broken before: the
only way to get IMGOPTSSYNTAX is with -luks, but the combination of
-luks and -nfs doesn't get qemu-img create commands right (because
qemu-img create doesn't support --image-opts yet), so even after this
fix some more work would be required to make the tests pass.
Reported-by: Tomáš Golembiovský <tgolembi@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
It is unused now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
It's the simpler interface to use for the raw format driver.
Apart from that, this removes the last user of the AIO emulation
implemented by bdrv_aio_ioctl().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
This allows drivers to implement ioctls in a coroutine-based way.
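For illustration, a minimal sketch of such a coroutine-based driver callback
(the function name and the trivial body are hypothetical, and the exact callback
signature is recalled from this era of the tree, so treat it as an assumption):
    static int coroutine_fn raw_co_ioctl(BlockDriverState *bs,
                                         unsigned long int req, void *buf)
    {
        BDRVRawState *s = bs->opaque;
        /* A coroutine_fn may simply block or yield as needed. */
        return ioctl(s->fd, req, buf);
    }
    /* hooked up in the driver's BlockDriver definition: */
    .bdrv_co_ioctl = raw_co_ioctl,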
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
It is unused now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
Instead of letting raw-posix use the bdrv_ioctl() abstraction to issue
an ioctl to itself, just call ioctl() directly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
All read/write functions already have a single coroutine-based function
on the BlockBackend level through which all requests go (no matter what
API style the external caller used) and which passes the requests down
to the block node level.
This patch exports a bdrv_co_ioctl() function and uses it to extend this
mode of operation to ioctls.
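A rough sketch of the pattern being described (the wrapper name below is
hypothetical; bdrv_co_ioctl() is the function exported by this patch):
    /* Every external ioctl entry point, synchronous or AIO style, funnels
     * into one coroutine-based function on the BlockBackend level ... */
    static int coroutine_fn blk_co_do_ioctl(BlockBackend *blk,
                                            unsigned long int req, void *buf)
    {
        /* ... which passes the request down to the block node level. */
        return bdrv_co_ioctl(blk_bs(blk), req, buf);
    }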
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
It is unused now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
All read/write functions already have a single coroutine-based function
on the BlockBackend level through which all requests go (no matter what
API style the external caller used) and which passes the requests down
to the block node level.
This patch extends this mode of operation to discards.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
All read/write functions already have a single coroutine-based function
on the BlockBackend level through which all requests go (no matter what
API style the external caller used) and which passes the requests down
to the block node level.
This patch extends this mode of operation to flushes.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
Merge remote-tracking branch 'remotes/rth/tags/pull-atomic-20161026' into staging
cmpxchg emulation of atomics, v8
# gpg: Signature made Wed 26 Oct 2016 16:30:03 BST
# gpg: using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg: aka "Richard Henderson <rth@redhat.com>"
# gpg: aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC 16A4 AD12 70CC 4DD0 279B
* remotes/rth/tags/pull-atomic-20161026: (37 commits)
target-alpha: Emulate LL/SC using cmpxchg helpers
target-alpha: Introduce MMU_PHYS_IDX
target-arm: remove EXCP_STREX + cpu_exclusive_{test, info}
linux-user: remove handling of aarch64's EXCP_STREX
linux-user: remove handling of ARM's EXCP_STREX
target-arm: emulate aarch64's LL/SC using cmpxchg helpers
target-arm: emulate SWP with atomic_xchg helper
target-arm: emulate LL/SC using cmpxchg helpers
target-arm: Rearrange aa32 load and store functions
tests: add atomic_add-bench
target-i386: remove helper_lock()
target-i386: emulate XCHG using atomic helper
target-i386: emulate LOCK'ed BTX ops using atomic helpers
target-i386: emulate LOCK'ed XADD using atomic helper
target-i386: emulate LOCK'ed NEG using cmpxchg helper
target-i386: emulate LOCK'ed NOT using atomic helper
target-i386: emulate LOCK'ed INC using atomic helper
target-i386: emulate LOCK'ed OP instructions using atomic helpers
target-i386: emulate LOCK'ed cmpxchg using cmpxchg helpers
tcg: Emit barriers with parallel_cpus
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Wed 26 Oct 2016 03:19:06 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
colo-proxy: fix memory leak
net: rtl8139: limit processing of ring descriptors
net: vmxnet: initialise local tx descriptor
e1000e: Don't zero out buffer address in rx descriptor
net: rocker: set limit to DMA buffer size
net: eepro100: fix memory leak in device uninit
tap-bsd: OpenBSD uses tap(4) now
net: pcnet: fix source formatting and indentation
net: pcnet: check rx/tx descriptor ring length
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Merge remote-tracking branch 'remotes/vivier/tags/m68k-part1-pull-request' into staging
# gpg: Signature made Tue 25 Oct 2016 19:58:46 BST
# gpg: using RSA key 0xF30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg: aka "Laurent Vivier <laurent@vivier.eu>"
# gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C
* remotes/vivier/tags/m68k-part1-pull-request: (23 commits)
target-m68k: Optimize gen_flush_flags
target-m68k: Optimize some comparisons
target-m68k: Use setcond for scc
target-m68k: Introduce DisasCompare
target-m68k: Reorg flags handling
target-m68k: Remove incorrect clearing of cc_x
target-m68k: Some fixes to SR and flags management
target-m68k: Print flags properly
target-m68k: update CPU flags management
target-m68k: don't update cc_dest in helpers
target-m68k: update move to/from ccr/sr
target-m68k: remove m68k_cpu_exec_enter() and m68k_cpu_exec_exit()
target-m68k: Replace helper_xflag_lt with setcond
target-m68k: allow to update flags with operation on words and bytes
target-m68k: REG() macro cleanup
target-m68k: set PAGE_BITS to 12 for m68k
target-m68k: define operand sizes
target-m68k: set disassembler mode to 680x0 or coldfire
target-m68k: introduce read_imXX() functions
target-m68k: manage scaled index
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. However, portable parallel
code is written assuming only cmpxchg is available, which means that in
practice this is a viable alternative.
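A minimal sketch of the idea in plain C11 atomics (not QEMU code): the
emulated store-conditional succeeds whenever the location still holds the
value seen by the load-locked part, even if it was changed to B and back
to A in between (the ABA hazard).
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* "Store-conditional" emulated with a compare-and-swap. */
    static bool emulated_sc(_Atomic uint64_t *addr, uint64_t ll_value,
                            uint64_t new_value)
    {
        /* Succeeds iff *addr still equals the value the LL loaded. */
        return atomic_compare_exchange_strong(addr, &ll_value, new_value);
    }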
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Rather than using helpers for physical accesses, use an MMU index.
The primary cleanup is with store-conditional on physical addresses.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
The exception is not emitted anymore; remove it and the associated
TCG variables.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-31-git-send-email-cota@braap.org>
|
|
The exception is not emitted anymore.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-30-git-send-email-cota@braap.org>
|
|
The exception is not emitted anymore.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-29-git-send-email-cota@braap.org>
|
|
Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. Portable parallel code, however,
is written assuming only cmpxchg--and not LL/SC--is available.
This means that in practice emulating LL/SC with cmpxchg is
a viable alternative.
This patch emulates LL/SC pairs on aarch64 with cmpxchg helpers.
This works in both user and system mode. In user mode, it avoids
pausing all other CPUs to perform the LL/SC pair. The resulting
performance and scalability improvement is significant, as the
plots below show. They plot the throughput of atomic_add-bench
compiled for ARM and executed on a 64-core x86 machine.
Hi-res plots: http://imgur.com/a/JVc8Y
[ASCII plots: atomic_add-bench, 1000000 ops/thread, for the [0,1], [0,2],
[0,128] and [0,1024] ranges; throughput vs. number of threads, cmpxchg vs. master.]
[rth: Rearrange 128-bit cmpxchg helper. Enforce alignment on LL.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-28-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-25-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. Portable parallel code, however,
is written assuming only cmpxchg--and not LL/SC--is available.
This means that in practice emulating LL/SC with cmpxchg is
a viable alternative.
This patch emulates LL/SC pairs on ARM with cmpxchg helpers.
This works in both user and system mode. In user mode, it avoids
pausing all other CPUs to perform the LL/SC pair. The resulting
performance and scalability improvement is significant, as the
plots below show. They plot the throughput of atomic_add-bench
compiled for ARM and executed on a 64-core x86 machine.
Hi-res plots: http://imgur.com/a/aNQpB
[ASCII plots: atomic_add-bench, 1000000 ops/thread, for the [0,1], [0,2],
[0,128] and [0,1024] ranges; throughput vs. number of threads, cmpxchg vs. master.]
[rth: Enforce alignment for ldrexd.]
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-23-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Stop specializing on TARGET_LONG_BITS == 32; unconditionally allocate
a temp and expand with tcg_gen_extu_i32_tl. Split out gen_aa32_addr,
gen_aa32_frob64, gen_aa32_ld_i32 and gen_aa32_st_i32 as separate interfaces.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
With this microbenchmark we can measure the overhead of emulating atomic
instructions with a configurable degree of contention.
The benchmark spawns $n threads, each performing $o atomic ops (additions)
in a loop. Each atomic operation is performed on a different cache line
(assuming lines are 64 bytes long) that is randomly selected from a range [0, $r).
[ Note: each $foo corresponds to a -foo flag ]
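A hypothetical sketch of the core of such a benchmark (plain C11 atomics plus
pthreads; this is not the actual tests/atomic_add-bench source):
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdint.h>
    #include <stdlib.h>

    #define LINE_SIZE 64              /* assume 64-byte cache lines */

    static char *counters;            /* $r * LINE_SIZE bytes, allocated by main() */
    static long n_ops, range;         /* $o and $r */

    static void *worker(void *arg)
    {
        unsigned seed = (unsigned)(uintptr_t)arg;

        for (long i = 0; i < n_ops; i++) {
            /* pick a random cache line in [0, range) ... */
            long idx = rand_r(&seed) % range;
            _Atomic long *p = (_Atomic long *)(counters + idx * LINE_SIZE);
            /* ... and perform the atomic addition on it */
            atomic_fetch_add(p, 1);
        }
        return NULL;
    }
    /* main() would pthread_create() $n such workers and join them. */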
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-20-git-send-email-cota@braap.org>
|
|
It's been superseded by the atomic helpers.
The use of the atomic helpers provides a significant performance and scalability
improvement. Below is the result of running the atomic_add-bench microbenchmark with:
$ x86_64-linux-user/qemu-x86_64 tests/atomic_add-bench -o 5000000 -r $r -n $n
where $n is the number of threads and $r is the allowed range for the additions.
The scenarios measured are:
- atomic: implements x86's ADDL with the atomic_add helper (i.e. this patchset)
- cmpxchg: implements x86's ADDL with a TCG loop using the cmpxchg helper
- master: before this patchset
Results sorted in ascending range, i.e. descending degree of contention.
Y axis is Throughput in Mops/s. Tests are run on an AMD machine with 64
Opteron 6376 cores.
[ASCII plots: atomic_add-bench, 5000000 ops/thread, for the [0,1], [0,2], [0,8],
[0,128] and [0,1024] ranges; throughput (Mops/s) vs. number of threads,
atomic vs. cmpxchg vs. master.]
hi-res: http://imgur.com/a/fMRmq
For master I stopped measuring after 8 threads, because there is little
point in measuring the well-known performance collapse of a contended lock.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-21-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-19-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
[rth: Avoid redundant qemu_ld in locked case. Fix previously unnoticed
incorrect zero-extension of address in register-offset case.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-18-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
[rth: Move load of reg value to common location.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-17-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
[rth: Move redundant qemu_load out of cmpxchg loop.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-16-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
[rth: Avoid qemu_load that's redundant with the atomic op.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-15-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
[rth: Merge gen_inc_locked back into gen_inc to share cc update.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-14-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
[rth: Eliminate some unnecessary temporaries.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-13-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
The diff here is uglier than necessary. All this does is to turn
    FOO
into:
    if (s->prefix & PREFIX_LOCK) {
        BAR
    } else {
        FOO
    }
where FOO is the original implementation of an unlocked cmpxchg.
[rth: Adjust unlocked cmpxchg to use movcond instead of branches.
Adjust helpers to use atomic helpers.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-6-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Allow qemu to build on 32-bit hosts without 64-bit atomic ops.
Even if we only allow 32-bit hosts to multi-thread emulate 32-bit
guests, we still need some way to handle the 32-bit guest using a
64-bit atomic operation. Do so by dropping back to single-step.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Force the use of cmpxchg16b on x86_64.
Wikipedia suggests that only very old AMD64 (circa 2004) did not have
this instruction. Further, it's required by Windows 8, so no new CPUs
will ever omit it.
If we truly care about those old CPUs, we could check this at startup time
and then avoid executing paths that use it.
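Such a startup-time probe on x86 could look roughly like this (a sketch using
GCC/clang's <cpuid.h>; it is not something this patch adds):
    #include <cpuid.h>

    static int have_cmpxchg16b(void)
    {
        unsigned eax, ebx, ecx, edx;

        if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
            return 0;
        }
        return (ecx >> 13) & 1;       /* CPUID.01H:ECX bit 13 is CX16 */
    }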
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Add all of cmpxchg, op_fetch, fetch_op, and xchg.
Handle both endiannesses, and sizes up to 8.
Handle expanding non-atomically, when emulating in serial.
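In plain C11 terms, a sketch of the naming convention only (not the QEMU
helpers themselves): fetch_op returns the old value, op_fetch the new one.
    #include <stdatomic.h>
    #include <stdint.h>

    static uint32_t fetch_add_u32(_Atomic uint32_t *p, uint32_t v)
    {
        return atomic_fetch_add(p, v);        /* returns the old value */
    }

    static uint32_t add_fetch_u32(_Atomic uint32_t *p, uint32_t v)
    {
        return atomic_fetch_add(p, v) + v;    /* returns the new value */
    }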
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
TGT_LE and TGT_BE are not size dependent and do not need to be
redefined. The others are no longer used at all.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Saves 2k of code size on a cold path.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
We already include exec/address-spaces.h and exec/memory.h in
cputlb.c; the include of qemu/timer.h appears to be a fossil.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
The variable parallel_cpus controls the generation of thread-aware
atomic code. We only need to set it once we clone our first thread.
At this point any existing translations need to be thrown away.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
When we cannot emulate an atomic operation within a parallel
context, this exception allows us to stop the world and try
again in a serial context.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Allows Int128 to be used more generally, rather than having to
begin with 64-bit inputs and accumulate.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
While the check against sizeof(void *) is appropriate for
normal usage within qemu, there are places in which we want
wider operations and have checked for their existence.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
This paves the way for upcoming work.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-9-git-send-email-cota@braap.org>
|
|
This paves the way for upcoming work.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-8-git-send-email-cota@braap.org>
|