riscv-gnu-toolchain/qemu.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
2016-10-26	linux-user: remove handling of aarch64's EXCP_STREX	Emilio G. Cota	1	-125/+0
	The exception is not emitted anymore. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1467054136-10430-30-git-send-email-cota@braap.org>
2016-10-26	linux-user: remove handling of ARM's EXCP_STREX	Emilio G. Cota	1	-93/+0
	The exception is not emitted anymore. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twidle.net> Message-Id: <1467054136-10430-29-git-send-email-cota@braap.org>
2016-10-26	target-arm: emulate aarch64's LL/SC using cmpxchg helpers	Emilio G. Cota	3	-58/+163
	Emulating LL/SC with cmpxchg is not correct, since it can suffer from the ABA problem. Portable parallel code, however, is written assuming only cmpxchg--and not LL/SC--is available. This means that in practice emulating LL/SC with cmpxchg is a viable alternative. The appended emulates LL/SC pairs in aarch64 with cmpxchg helpers. This works in both user and system mode. In usermode, it avoids pausing all other CPUs to perform the LL/SC pair. The subsequent performance and scalability improvement is significant, as the plots below show. They plot the throughput of atomic_add-bench compiled for ARM and executed on a 64-core x86 machine. Hi-res plots: http://imgur.com/a/JVc8Y atomic_add-bench: 1000000 ops/thread, [0,1] range 18 ++---------+----------+---------+----------+----------+----------+---++ +cmpxchg +-E--+ + + + + + \| 16 ++master +-H--+ ++ \|\| \| 14 ++ ++ \| \| \| 12 ++\| ++ \| \| \| 10 ++++ ++ 8 ++E ++ \|+++ \| 6 ++ \| ++ \| \| \| 4 ++ \| ++ \| \| \| 2 +H++E+--- ++ + \| +E++----+E+---+--+E+----++E+------+E+------+E++----+E+---+--+E\| 0 ++H-H----H-+-----H----+---------+----------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads atomic_add-bench: 1000000 ops/thread, [0,2] range 18 ++---------+----------+---------+----------+----------+----------+---++ +cmpxchg +-E--+ + + + + + \| 16 ++master +-H--+ ++ \| \| \| 14 ++E ++ \| \| \| 12 ++\| ++ \|+++ \| 10 ++ \| ++ 8 ++ \| ++ \| \| \| 6 ++ \| ++ \| \| \| 4 ++ \| ++ \| +E+--- \| 2 +H+ +E+-----+++ +++ +++ ---+E+-----+E+------+++ +++ + +E+---+--+E+----++E+------+E+--- ++++ +++ + +E\| 0 ++H-H----H-+-----H----+---------+----------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads atomic_add-bench: 1000000 ops/thread, [0,128] range 70 ++---------+----------+---------+----------+----------+----------+---++ +cmpxchg +-E--+ + + + + + \| 60 ++master +-H--+ +++ ---+E+-----+E+------+E+ \| +E+------E-------+E+--- \| \| --- +++ \| 50 ++ +++--- ++ \| -+E+ \| 40 ++ +++---- ++ \| E- \| \| --\| \| 30 ++ -- +++ ++ \| +E+ \| 20 ++E+ ++ \|E+ \| \| \| 10 ++ ++ + + + + + + + \| 0 +HH-H----H-+-----H----+---------+----------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads atomic_add-bench: 1000000 ops/thread, [0,1024] range 160 ++---------+---------+----------+---------+----------+----------+---++ +cmpxchg +-E--+ + + + + + \| 140 ++master +-H--+ +++ +++ \| -+E+-----+E+-------E\| 120 ++ +++ ---- +++ \| +++ ----E-- \| 100 ++ --E--- +++ ++ \| +++ ---- +++ \| 80 ++ --E-- ++ \| ---- +++ \| \| -+E+ \| 60 ++ ---- +++ ++ \| +E+- \| 40 ++ -- ++ \| +E+ \| 20 +EE+ ++ +++ + + + + + + \| 0 +HH-H---H--+-----H---+----------+---------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads [rth: Rearrange 128-bit cmpxchg helper. Enforce alignment on LL.] Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-28-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	target-arm: emulate SWP with atomic_xchg helper	Emilio G. Cota	1	-12/+14
	Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-25-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	target-arm: emulate LL/SC using cmpxchg helpers	Emilio G. Cota	1	-95/+45
	Emulating LL/SC with cmpxchg is not correct, since it can suffer from the ABA problem. Portable parallel code, however, is written assuming only cmpxchg--and not LL/SC--is available. This means that in practice emulating LL/SC with cmpxchg is a viable alternative. The appended emulates LL/SC pairs in ARM with cmpxchg helpers. This works in both user and system mode. In usermode, it avoids pausing all other CPUs to perform the LL/SC pair. The subsequent performance and scalability improvement is significant, as the plots below show. They plot the throughput of atomic_add-bench compiled for ARM and executed on a 64-core x86 machine. Hi-res plots: http://imgur.com/a/aNQpB atomic_add-bench: 1000000 ops/thread, [0,1] range 9 ++---------+----------+----------+----------+----------+----------+---++ +cmpxchg +-E--+ + + + + + \| 8 +Emaster +-H--+ ++ \| \| \| 7 ++E ++ \| \| \| 6 ++++ ++ \| \| \| 5 ++ \| ++ 4 ++ \| ++ \| \| \| 3 ++ \| ++ \| \| \| 2 ++ \| ++ \|H++E+--- +++ ---+E+------+E+------+E\| 1 +++ +E+-----+E+------+E+------+E+------+E+-- +++ +++ ++ ++H+ + +++ + +++ ++++ + + + \| 0 ++--H----H-+-----H----+----------+----------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads atomic_add-bench: 1000000 ops/thread, [0,2] range 16 ++---------+----------+---------+----------+----------+----------+---++ +cmpxchg +-E--+ + + + + + \| 14 ++master +-H--+ ++ \| \| \| 12 ++\| ++ \| E \| 10 ++\| ++ \| \| \| 8 ++++ ++ \|E+\| \| \| \| \| 6 ++ \| ++ \| \| \| 4 ++ \| ++ \| +E+--- +++ +++ +++ ---+E+------+E\| 2 +H+ +E+------E-------+E+-----+E+------+E+------+E+-- +++ + \| + +++ + ++++ + + + \| 0 ++H-H----H-+-----H----+---------+----------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads atomic_add-bench: 1000000 ops/thread, [0,128] range 70 ++---------+----------+---------+----------+----------+----------+---++ +cmpxchg +-E--+ + + + ++++ + \| 60 ++master +-H--+ ----E------+E+-------++ \| -+E+--- +++ +++ +E\| \| +++ ---- +++ ++\| 50 ++ +++ ---+E+- ++ \| -E--- \| 40 ++ ---+++ ++ \| +++--- \| \| -+E+ \| 30 ++ +++---- ++ \| +E+ \| 20 ++ +++-- ++ \| +E+ \| \|+E+ \| 10 +E+ ++ + + + + + + + \| 0 +HH-H----H-+-----H----+---------+----------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads atomic_add-bench: 1000000 ops/thread, [0,1024] range 120 ++---------+---------+----------+---------+----------+----------+---++ +cmpxchg +-E--+ + + + + + \| \| master +-H--+ ++\| 100 ++ ----E+ \| +++ ---+E+--- ++\| \| --E--- +++ \| 80 ++ ---- +++ ++ \| ---+E+- \| 60 ++ -+E+-- ++ \| +++ ---- +++ \| \| -+E+- \| 40 ++ +++---- ++ \| +++ ---+E+ \| \| -+E+--- \| 20 ++ +E+ ++ \|+E+++ \| +E+ + + + + + + \| 0 +HH-H---H--+-----H---+----------+---------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads [rth: Enforce alignment for ldrexd.] Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-23-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	target-arm: Rearrange aa32 load and store functions	Richard Henderson	1	-105/+66
	Stop specializing on TARGET_LONG_BITS == 32; unconditionally allocate a temp and expand with tcg_gen_extu_i32_tl. Split out gen_aa32_addr, gen_aa32_frob64, gen_aa32_ld_i32 and gen_aa32_st_i32 as separate interfaces. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	tests: add atomic_add-bench	Emilio G. Cota	3	-1/+167
	With this microbenchmark we can measure the overhead of emulating atomic instructions with a configurable degree of contention. The benchmark spawns $n threads, each performing $o atomic ops (additions) in a loop. Each atomic operation is performed on a different cache line (assuming lines are 64b long) that is randomly selected from a range [0, $r). [ Note: each $foo corresponds to a -foo flag ] Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1467054136-10430-20-git-send-email-cota@braap.org>
2016-10-26	target-i386: remove helper_lock()	Emilio G. Cota	3	-50/+0
	It's been superseded by the atomic helpers. The use of the atomic helpers provides a significant performance and scalability improvement. Below is the result of running the atomic_add-test microbenchmark with: $ x86_64-linux-user/qemu-x86_64 tests/atomic_add-bench -o 5000000 -r $r -n $n , where $n is the number of threads and $r is the allowed range for the additions. The scenarios measured are: - atomic: implements x86' ADDL with the atomic_add helper (i.e. this patchset) - cmpxchg: implement x86' ADDL with a TCG loop using the cmpxchg helper - master: before this patchset Results sorted in ascending range, i.e. descending degree of contention. Y axis is Throughput in Mops/s. Tests are run on an AMD machine with 64 Opteron 6376 cores. atomic_add-bench: 5000000 ops/thread, [0,1] range 25 ++---------+----------+---------+----------+----------+----------+---++ + atomic +-E--+ + + + + + \| \|cmpxchg +-H--+ \| 20 +Emaster +-N--+ ++ \|\| \| \|++ \| \|\| \| 15 +++ ++ \|N\| \| \|+\| \| 10 ++\| ++ \|+\|+ \| \| \| -+E+------ +++ ---+E+------+E+------+E+-----+E+------+E\| \|+E+E+- +++ +E+------+E+-- \| 5 ++\|+ ++ \|+N+H+--- +++ \| ++++N+--+H++----+++ + +++ --++H+------+H+------+H++----+H+---+--- \| 0 ++---------+-----H----+---H-----+----------+----------+----------+---H+ 0 10 20 30 40 50 60 Number of threads atomic_add-bench: 5000000 ops/thread, [0,2] range 25 ++---------+----------+---------+----------+----------+----------+---++ ++atomic +-E--+ + + + + + \| \|cmpxchg +-H--+ \| 20 ++master +-N--+ ++ \|E\| \| \|++ \| \|\|E \| 15 ++\| ++ \|N\|\| \| \|+\|\| ---+E+------+E+-----+E+------+E\| 10 ++\| \| ---+E+------+E+-----+E+--- +++ +++ \|\|H+E+--+E+-- \| \|+++++ \| \| \|\| \| 5 ++\|+H+-- +++ ++ \|+N+ - ---+H+------+H+------ \| + +N+--+H++----+H+---+--+H+----++H+--- + + +H+---+--+H\| 0 ++---------+----------+---------+----------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads atomic_add-bench: 5000000 ops/thread, [0,8] range 40 ++---------+----------+---------+----------+----------+----------+---++ ++atomic +-E--+ + + + + + \| 35 +cmpxchg +-H--+ ++ \| master +-N--+ ---+E+------+E+------+E+-----+E+------+E\| 30 ++\| ---+E+-- +++ ++ \| \| -+E+--- \| 25 ++E ---- +++ ++ \|+++++ -+E+ \| 20 +E+ E-- +++ ++ \|H\|+++ \| \|+\| +H+------- \| 15 ++H+ ---+++ +H+------ ++ \|N++H+-- +++--- +H+------++\| 10 ++ +++ - +++ ---+H+ +++ +H+ \| \| +H+-----+H+------+H+-- \| 5 ++\| +++ ++ ++N+N+--+N++ + + + + + \| 0 ++---------+----------+---------+----------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads atomic_add-bench: 5000000 ops/thread, [0,128] range 160 ++---------+---------+----------+---------+----------+----------+---++ + atomic +-E--+ + + + + + \| 140 +cmpxchg +-H--+ +++ +++ ++ \| master +-N--+ E--------E------+E+------++\| 120 ++ --\| \| +++ E+ \| -- +++ +++ ++\| 100 ++ - ++ \| +++- +++ ++\| 80 ++ -+E+ -+H+------+H+------H--------++ \| ---- ---- +++ H\| \| ---+E+-----+E+- ---+H+ ++\| 60 ++ +E+--- +++ ---+H+--- ++ \| --+++ ---+H+-- \| 40 ++ +E+-+H+--- ++ \| +H+ \| 20 +EE+ ++ +N+ + + + + + + \| 0 ++N-N---N--+---------+----------+---------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads atomic_add-bench: 5000000 ops/thread, [0,1024] range 350 ++---------+---------+----------+---------+----------+----------+---++ + atomic +-E--+ + + + + + \| 300 +cmpxchg +-H--+ +++ \| master +-N--+ +++ \|\| \| +++ \| ----E\| 250 ++ \| ----E---- ++ \| ----E--- \| ---+H\| 200 ++ -+E+--- +++ ---+H+--- ++ \| ---- -+H+-- \| \| +E+ +++ ---- +++ \| 150 ++ ---+++ ---+H+- ++ \| --- -+H+-- \| 100 ++ ---+E+ ---- +++ ++ \| +++ ---+E+-----+H+- \| \| -+E+------+H+-- \| 50 ++ +E+ ++ +EE+ + + + + + + \| 0 ++N-N---N--+---------+----------+---------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads hi-res: http://imgur.com/a/fMRmq For master I stopped measuring master after 8 threads, because there is little point in measuring the well-known performance collapse of a contended lock. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-21-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	target-i386: emulate XCHG using atomic helper	Emilio G. Cota	1	-6/+2
	Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-19-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	target-i386: emulate LOCK'ed BTX ops using atomic helpers	Emilio G. Cota	1	-30/+57
	[rth: Avoid redundant qemu_ld in locked case. Fix previously unnoticed incorrect zero-extension of address in register-offset case.] Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-18-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	target-i386: emulate LOCK'ed XADD using atomic helper	Emilio G. Cota	1	-5/+10
	[rth: Move load of reg value to common location.] Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-17-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	target-i386: emulate LOCK'ed NEG using cmpxchg helper	Emilio G. Cota	1	-4/+34
	[rth: Move redundant qemu_load out of cmpxchg loop.] Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-16-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	target-i386: emulate LOCK'ed NOT using atomic helper	Emilio G. Cota	1	-6/+20
	[rth: Avoid qemu_load that's redundant with the atomic op.] Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-15-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	target-i386: emulate LOCK'ed INC using atomic helper	Emilio G. Cota	1	-11/+13
	[rth: Merge gen_inc_locked back into gen_inc to share cc update.] Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-14-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	target-i386: emulate LOCK'ed OP instructions using atomic helpers	Emilio G. Cota	1	-18/+58
	[rth: Eliminate some unnecessary temporaries.] Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-13-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	target-i386: emulate LOCK'ed cmpxchg using cmpxchg helpers	Emilio G. Cota	3	-66/+169
	The diff here is uglier than necessary. All this does is to turn FOO into: if (s->prefix & PREFIX_LOCK) { BAR } else { FOO } where FOO is the original implementation of an unlocked cmpxchg. [rth: Adjust unlocked cmpxchg to use movcond instead of branches. Adjust helpers to use atomic helpers.] Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-6-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	tcg: Emit barriers with parallel_cpus	Richard Henderson	1	-11/+1
	Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	tcg: Add CONFIG_ATOMIC64	Richard Henderson	6	-13/+114
	Allow qemu to build on 32-bit hosts without 64-bit atomic ops. Even if we only allow 32-bit hosts to multi-thread emulate 32-bit guests, we still need some way to handle the 32-bit guest using a 64-bit atomic operation. Do so by dropping back to single-step. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	tcg: Add atomic128 helpers	Richard Henderson	6	-3/+119
	Force the use of cmpxchg16b on x86_64. Wikipedia suggests that only very old AMD64 (circa 2004) did not have this instruction. Further, it's required by Windows 8 so no new cpus will ever omit it. If we truely care about these, then we could check this at startup time and then avoid executing paths that use it. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	tcg: Add atomic helpers	Richard Henderson	9	-15/+826
	Add all of cmpxchg, op_fetch, fetch_op, and xchg. Handle both endian-ness, and sizes up to 8. Handle expanding non-atomically, when emulating in serial. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	cputlb: Tidy some macros	Richard Henderson	2	-22/+8
	TGT_LE and TGT_BE are not size dependent and do not need to be redefined. The others are no longer used at all. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	cputlb: Move most of iotlb code out of line	Richard Henderson	2	-42/+47
	Saves 2k code size off of a cold path. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	cputlb: Remove includes from softmmu_template.h	Richard Henderson	1	-4/+0
	We already include exec/address-spaces.h and exec/memory.h in cputlb.c; the include of qemu/timer.h appears to be a fossil. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	cputlb: Move probe_write out of softmmu_template.h	Richard Henderson	2	-23/+21
	Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	cputlb: Replace SHIFT with DATA_SIZE	Richard Henderson	2	-13/+10
	Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	linux-user: enable parallel code generation on clone	Alex Bennée	1	-0/+8
	The variable parallel_cpus controls the generation of thread aware atomic code. We only need to set it once we clone our first thread. At this point any existing translations need to be thrown away. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	tcg: Add EXCP_ATOMIC	Richard Henderson	9	-0/+88
	When we cannot emulate an atomic operation within a parallel context, this exception allows us to stop the world and try again in a serial context. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	int128: Add int128_make128	Richard Henderson	1	-5/+15
	Allows Int128 to be used more generally, rather than having to begin with 64-bit inputs and accumulate. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	int128: Use __int128 if available	Richard Henderson	2	-12/+145
	Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	exec: Avoid direct references to Int128 parts	Richard Henderson	2	-2/+12
	Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	atomics: Add __nocheck atomic operations	Richard Henderson	1	-9/+27
	While the check against sizeof(void *) is appropriate for normal usage within qemu, there are places in which we want wider operaions and have checked for their existance. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	atomics: add atomic_op_fetch variants	Emilio G. Cota	1	-0/+17
	This paves the way for upcoming work. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1467054136-10430-9-git-send-email-cota@braap.org>
2016-10-26	atomics: add atomic_xor	Emilio G. Cota	1	-0/+4
	This paves the way for upcoming work. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1467054136-10430-8-git-send-email-cota@braap.org>
2016-10-26	atomics: Add parameters to macros	Richard Henderson	1	-5/+5
	Making these functional rather than object macros will prevent later problems with complex macro expansion. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	audio: intel-hda: check stream entry count during transfer	Prasad J Pandit	1	-1/+2
	Intel HDA emulator uses stream of buffers during DMA data transfers. Each entry has buffer length and buffer pointer position, which are used to derive bytes to 'copy'. If this length and buffer pointer were to be same, 'copy' could be set to zero(0), leading to an infinite loop. Add check to avoid it. Reported-by: Huawei PSIRT <psirt@huawei.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1476949224-6865-1-git-send-email-ppandit@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-10-26	colo-proxy: fix memory leak	Zhang Chen	3	-31/+21
	Fix memory leak in colo-compare.c and filter-rewriter.c Report by Coverity and add some comments. Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-10-26	net: rtl8139: limit processing of ring descriptors	Prasad J Pandit	1	-1/+1
	RTL8139 ethernet controller in C+ mode supports multiple descriptor rings, each with maximum of 64 descriptors. While processing transmit descriptor ring in 'rtl8139_cplus_transmit', it does not limit the descriptor count and runs forever. Add check to avoid it. Reported-by: Andrew Henderson <hendersa@icculus.org> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-10-26	net: vmxnet: initialise local tx descriptor	Li Qiang	1	-0/+1
	In Vmxnet3 device emulator while processing transmit(tx) queue, when it reaches end of packet, it calls vmxnet3_complete_packet. In that local 'txcq_descr' object is not initialised, which could leak host memory bytes a guest. Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-10-26	e1000e: Don't zero out buffer address in rx descriptor	Kevin Wolf	1	-4/+4
	The e1000e emulation zeroes out any used rx descriptor and then writes a completely newly constructed value there. By doing this, it doesn't only update the write-back area of the descriptors (as it's supposed to do), but it also clears the buffer address, which real hardware doesn't do. The spec explicitly mentions in chapter 7.1.8 that it is valid for a driver to reuse a descriptor and only update the status field while doing so, i.e. reusing the old buffer address: If software statically allocates buffers, and uses memory read to check for completed descriptors, it simply has to zero the status byte in the descriptor to make it ready for reuse by hardware. This patch fixes the behaviour to leave the buffer address in descriptors unchanged even after the descriptor has been used. Signed-off-by: Kevin Wolf <mail@kevin-wolf.de> Reviewed-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-10-26	net: rocker: set limit to DMA buffer size	Prasad J Pandit	1	-1/+1
	Rocker network switch emulator has test registers to help debug DMA operations. While testing host DMA access, a buffer address is written to register 'TEST_DMA_ADDR' and its size is written to register 'TEST_DMA_SIZE'. When performing TEST_DMA_CTRL_INVERT test, if DMA buffer size was greater than 'INT_MAX', it leads to an invalid buffer access. Limit the DMA buffer size to avoid it. Reported-by: Huawei PSIRT <psirt@huawei.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-10-26	net: eepro100: fix memory leak in device uninit	Li Qiang	1	-0/+1
	The exit dispatch of eepro100 network card device doesn't free the 's->vmstate' field which was allocated in device realize thus leading a host memory leak. This patch avoid this. Signed-off-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-10-26	tap-bsd: OpenBSD uses tap(4) now	Brad Smith	1	-1/+5
	Update the tap-bsd code now that OpenBSD uses tap(4). Signed-off-by: Brad Smith <brad@comstyle.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-10-26	net: pcnet: fix source formatting and indentation	Prasad J Pandit	1	-63/+67
	Fix indentations and source format at few places. Add braces around 'if' and 'while' statements. Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-10-26	net: pcnet: check rx/tx descriptor ring length	Prasad J Pandit	1	-0/+3
	The AMD PC-Net II emulator has set of control and status(CSR) registers. Of these, CSR76 and CSR78 hold receive and transmit descriptor ring length respectively. This ring length could range from 1 to 65535. Setting ring length to zero leads to an infinite loop in pcnet_rdra_addr() or pcnet_transmit(). Add check to avoid it. Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-10-25	target-m68k: Optimize gen_flush_flags	Richard Henderson	1	-4/+52
	Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2016-10-25	target-m68k: Optimize some comparisons	Richard Henderson	1	-6/+103
	Signed-off-by: Richard Henderson <rth@twiddle.net> [laurent: fixed VC and VS: assign v1, not v2] Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2016-10-25	target-m68k: Use setcond for scc	Richard Henderson	1	-9/+11
	Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2016-10-25	target-m68k: Introduce DisasCompare	Richard Henderson	1	-24/+61
	Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2016-10-25	target-m68k: Reorg flags handling	Richard Henderson	7	-497/+359
	Separate all ccr bits. Continue to batch updates via cc_op. Signed-off-by: Richard Henderson <rth@twiddle.net> Fix gen_logic_cc() to really extend the size of the result. Fix gen_get_ccr(): update cc_op as it is used by the helper. Factorize flags computing and src/ccr cleanup Signed-off-by: Laurent Vivier <laurent@vivier.eu> target-m68k: sr/ccr cleanup Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2016-10-25	target-m68k: Remove incorrect clearing of cc_x	Richard Henderson	1	-7/+0
	The CF docs certainly doesnt suggest this is true. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Laurent Vivier <laurent@vivier.eu>