author | Peter Maydell <peter.maydell@linaro.org> | 2018-03-16 11:05:03 +0000 |
---|---|---|
committer | Peter Maydell <peter.maydell@linaro.org> | 2018-03-16 11:05:03 +0000 |
commit | 3788c7b6e56fa34ee2a73e41706eb2a2447ba75a (patch) | |
tree | 8f016e7c9175686b4d7c2d1847c8cc877102dc6b /docs/devel | |
parent | a57946ff2acb9c0d95c9f127914540586b0b8c21 (diff) | |
parent | 0790f86861079b1932679d0f011e431aaf4ee9e2 (diff) | |
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Record-replay lockstep execution, log dumper and fixes (Alex, Pavel)
* SCSI fix to pass maximum transfer size (Daniel Barboza)
* chardev fixes and improved iothread support (Daniel Berrangé, Peter)
* checkpatch tweak (Eric)
* make help tweak (Marc-André)
* make more PCI NICs available with -net or -nic (myself)
* change default q35 NIC to e1000e (myself)
* SCSI support for NDOB bit (myself)
* membarrier system call support (myself)
* SuperIO refactoring (Philippe)
* miscellaneous cleanups and fixes (Thomas)
# gpg: Signature made Mon 12 Mar 2018 16:10:52 GMT
# gpg: using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (69 commits)
tcg: fix cpu_io_recompile
replay: update documentation
replay: save vmstate of the asynchronous events
replay: don't process async events when warping the clock
scripts/replay-dump.py: replay log dumper
replay: avoid recursive call of checkpoints
replay: check return values of fwrite
replay: push replay_mutex_lock up the call tree
replay: don't destroy mutex at exit
replay: make locking visible outside replay code
replay/replay-internal.c: track holding of replay_lock
replay/replay.c: bump REPLAY_VERSION again
replay: save prior value of the host clock
replay: added replay log format description
replay: fix save/load vm for non-empty queue
replay: fixed replay_enable_events
replay: fix processing async events
cpu-exec: fix exception_index handling
hw/i386/pc: Factor out the superio code
hw/alpha/dp264: Use the TYPE_SMC37C669_SUPERIO
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# default-configs/i386-softmmu.mak
# default-configs/x86_64-softmmu.mak
Diffstat (limited to 'docs/devel')
-rw-r--r-- | docs/devel/atomics.txt | 57 |
1 files changed, 30 insertions, 27 deletions
diff --git a/docs/devel/atomics.txt b/docs/devel/atomics.txt
index 10c5fa3..a4db3a4 100644
--- a/docs/devel/atomics.txt
+++ b/docs/devel/atomics.txt
@@ -122,20 +122,30 @@ In general, if the algorithm you are writing includes both writes
 and reads on the same side, it is generally simpler to use sequentially
 consistent primitives.
 
-When using this model, variables are accessed with atomic_read() and
-atomic_set(), and restrictions to the ordering of accesses is enforced
+When using this model, variables are accessed with:
+
+- atomic_read() and atomic_set(); these prevent the compiler from
+  optimizing accesses out of existence and creating unsolicited
+  accesses, but do not otherwise impose any ordering on loads and
+  stores: both the compiler and the processor are free to reorder
+  them.
+
+- atomic_load_acquire(), which guarantees the LOAD to appear to
+  happen, with respect to the other components of the system,
+  before all the LOAD or STORE operations specified afterwards.
+  Operations coming before atomic_load_acquire() can still be
+  reordered after it.
+
+- atomic_store_release(), which guarantees the STORE to appear to
+  happen, with respect to the other components of the system,
+  after all the LOAD or STORE operations specified afterwards.
+  Operations coming after atomic_store_release() can still be
+  reordered after it.
+
+Restrictions to the ordering of accesses can also be specified
 using the memory barrier macros: smp_rmb(), smp_wmb(), smp_mb(),
 smp_mb_acquire(), smp_mb_release(), smp_read_barrier_depends().
 
-atomic_read() and atomic_set() prevents the compiler from using
-optimizations that might otherwise optimize accesses out of existence
-on the one hand, or that might create unsolicited accesses on the other.
-In general this should not have any effect, because the same compiler
-barriers are already implied by memory barriers.  However, it is useful
-to do so, because it tells readers which variables are shared with
-other threads, and which are local to the current thread or protected
-by other, more mundane means.
-
 Memory barriers control the order of references to shared memory.
 They come in six kinds:
 
@@ -232,7 +242,7 @@ make atomic_mb_set() the more expensive operation.
 
 There are two common cases in which atomic_mb_read and atomic_mb_set
 generate too many memory barriers, and thus it can be useful to manually
-place barriers instead:
+place barriers, or use atomic_load_acquire/atomic_store_release instead:
 
 - when a data structure has one thread that is always a writer
   and one thread that is always a reader, manual placement of
@@ -243,18 +253,15 @@ place barriers instead:
     thread 1                                thread 1
     -------------------------               ------------------------
     (other writes)
-                                            smp_mb_release()
-    atomic_mb_set(&a, x)                    atomic_set(&a, x)
-                                            smp_wmb()
-    atomic_mb_set(&b, y)                    atomic_set(&b, y)
+    atomic_mb_set(&a, x)                    atomic_store_release(&a, x)
+    atomic_mb_set(&b, y)                    atomic_store_release(&b, y)
 
                                        =>
     thread 2                                thread 2
     -------------------------               ------------------------
-    y = atomic_mb_read(&b)                  y = atomic_read(&b)
-                                            smp_rmb()
-    x = atomic_mb_read(&a)                  x = atomic_read(&a)
-                                            smp_mb_acquire()
+    y = atomic_mb_read(&b)                  y = atomic_load_acquire(&b)
+    x = atomic_mb_read(&a)                  x = atomic_load_acquire(&a)
+    (other reads)
 
   Note that the barrier between the stores in thread 1, and between
   the loads in thread 2, has been optimized here to a write or a
@@ -276,7 +283,6 @@ place barriers instead:
                                             smp_mb_acquire();
 
     Similarly, atomic_mb_set() can be transformed as follows:
-    smp_mb():
 
                                             smp_mb_release();
     for (i = 0; i < 10; i++)        =>      for (i = 0; i < 10; i++)
@@ -284,6 +290,8 @@ place barriers instead:
                                             smp_mb();
 
 
+  The other thread can still use atomic_mb_read()/atomic_mb_set().
+
   The two tricks can be combined.  In this case, splitting a loop in
   two lets you hoist the barriers out of the loops _and_ eliminate the
   expensive smp_mb():
@@ -296,8 +304,6 @@ expensive smp_mb():
       atomic_set(&a[i], false);
                                             smp_mb();
 
-  The other thread can still use atomic_mb_read()/atomic_mb_set()
-
 Memory barrier pairing
 ----------------------
 
@@ -386,10 +392,7 @@ and memory barriers, and the equivalents in QEMU:
   note that smp_store_mb() is a little weaker than atomic_mb_set().
   atomic_mb_read() compiles to the same instructions as Linux's
   smp_load_acquire(), but this should be treated as an implementation
-  detail.  QEMU does have atomic_load_acquire() and atomic_store_release()
-  macros, but for now they are only used within atomic.h.  This may
-  change in the future.
-
+  detail.
 
 SOURCES
 =======