aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-06-15tcg: Reduce max TB opcode countRichard Henderson10-13/+26
Also, assert that we don't overflow any of two different offsets into the TB. Both unwind and goto_tb both record a uint16_t for later use. This fixes an arm-softmmu test case utilizing NEON in which there is a TB generated that runs to 7800 opcodes, and compiles to 96k on an x86_64 host. This overflows the 16-bit offset in which we record the goto_tb reset offset. Because of that overflow, we install a jump destination that goes to neverland. Boom. With this reduced op count, the same TB compiles to about 48k for aarch64, ppc64le, and x86_64 hosts, and neither assertion fires. Cc: qemu-stable@nongnu.org Reported-by: "Jason A. Donenfeld" <Jason@zx2c4.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15tcg: remove tb_lockEmilio G. Cota11-152/+75
Use mmap_lock in user-mode to protect TCG state and the page descriptors. In !user-mode, each vCPU has its own TCG state, so no locks needed. Per-page locks are used to protect the page descriptors. Per-TB locks are used in both modes to protect TB jumps. Some notes: - tb_lock is removed from notdirty_mem_write by passing a locked page_collection to tb_invalidate_phys_page_fast. - tcg_tb_lookup/remove/insert/etc have their own internal lock(s), so there is no need to further serialize access to them. - do_tb_flush is run in a safe async context, meaning no other vCPU threads are running. Therefore acquiring mmap_lock there is just to please tools such as thread sanitizer. - Not visible in the diff, but tb_invalidate_phys_page already has an assert_memory_lock. - cpu_io_recompile is !user-only, so no mmap_lock there. - Added mmap_unlock()'s before all siglongjmp's that could be called in user-mode while mmap_lock is held. + Added an assert for !have_mmap_lock() after returning from the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic. Performance numbers before/after: Host: AMD Opteron(tm) Processor 6376 ubuntu 17.04 ppc64 bootup+shutdown time 700 +-+--+----+------+------------+-----------+------------*--+-+ | + + + + + *B | | before ***B*** ** * | |tb lock removal ###D### *** | 600 +-+ *** +-+ | ** # | | *B* #D | | *** * ## | 500 +-+ *** ### +-+ | * *** ### | | *B* # ## | | ** * #D# | 400 +-+ ** ## +-+ | ** ### | | ** ## | | ** # ## | 300 +-+ * B* #D# +-+ | B *** ### | | * ** #### | | * *** ### | 200 +-+ B *B #D# +-+ | #B* * ## # | | #* ## | | + D##D# + + + + | 100 +-+--+----+------+------------+-----------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/HwmBHXe debian jessie aarch64 bootup+shutdown time 90 +-+--+-----+-----+------------+------------+------------+--+-+ | + + + + + + | | before ***B*** B | 80 +tb lock removal ###D### **D +-+ | **### | | **## | 70 +-+ ** # +-+ | ** ## | | ** # | 60 +-+ *B ## +-+ | ** ## | | *** #D | 50 +-+ *** ## +-+ | * ** ### | | **B* ### | 40 +-+ **** # ## +-+ | **** #D# | | ***B** ### | 30 +-+ B***B** #### +-+ | B * * # ### | | B ###D# | 20 +-+ D ##D## +-+ | D# | | + + + + + + | 10 +-+--+-----+-----+------------+------------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/iGpGFtv The gains are high for 4-8 CPUs. Beyond that point, however, unrelated lock contention significantly hurts scalability. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15translate-all: remove tb_lock mention from cpu_restore_state_from_tbEmilio G. Cota1-1/+0
tb_lock was needed when the function did retranslation. However, since fca8a500d519 ("tcg: Save insn data and use it in cpu_restore_state_from_tb") we don't do retranslation. Get rid of the comment. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15cputlb: remove tb_lock from tlb_flush functionsEmilio G. Cota1-8/+0
The acquisition of tb_lock was added when the async tlb_flush was introduced in e3b9ca810 ("cputlb: introduce tlb_flush_* async work.") tb_lock was there to allow us to do memset() on the tb_jmp_cache's. However, since f3ced3c5928 ("tcg: consistently access cpu->tb_jmp_cache atomically") all accesses to tb_jmp_cache are atomic, so tb_lock is not needed here. Get rid of it. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15translate-all: protect TB jumps with a per-destination-TB lockEmilio G. Cota4-76/+124
This applies to both user-mode and !user-mode emulation. Instead of relying on a global lock, protect the list of incoming jumps with tb->jmp_lock. This lock also protects tb->cflags, so update all tb->cflags readers outside tb->jmp_lock to use atomic reads via tb_cflags(). In order to find the destination TB (and therefore its jmp_lock) from the origin TB, we introduce tb->jmp_dest[]. I considered not using a linked list of jumps, which simplifies code and makes the struct smaller. However, it unnecessarily increases memory usage, which results in a performance decrease. See for instance these numbers booting+shutting down debian-arm: Time (s) Rel. err (%) Abs. err (s) Rel. slowdown (%) ------------------------------------------------------------------------------ before 20.88 0.74 0.154512 0. after 20.81 0.38 0.079078 -0.33524904 GTree 21.02 0.28 0.058856 0.67049808 GHashTable + xxhash 21.63 1.08 0.233604 3.5919540 Using a hash table or a binary tree to keep track of the jumps doesn't really pay off, not only due to the increased memory usage, but also because most TBs have only 0 or 1 jumps to them. The maximum number of jumps when booting debian-arm that I measured is 35, but as we can see in the histogram below a TB with that many incoming jumps is extremely rare; the average TB has 0.80 incoming jumps. n_jumps: 379208; avg jumps/tb: 0.801099 dist: [0.0,1.0)|▄█▁▁▁▁▁▁▁▁▁▁▁ ▁▁▁▁▁▁ ▁▁▁ ▁▁▁ ▁|[34.0,35.0] Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15translate-all: discard TB when tb_link_page returns an existing matching TBEmilio G. Cota3-21/+46
Use the recently-gained QHT feature of returning the matching TB if it already exists. This allows us to get rid of the lookup we perform right after acquiring tb_lock. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15translate-all: introduce assert_no_pages_lockedEmilio G. Cota3-0/+16
The appended adds assertions to make sure we do not longjmp with page locks held. Note that user-mode has nothing to check, since page_locks are !user-mode only. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15translate-all: add page_locked assertionsEmilio G. Cota1-3/+79
This is only compiled under CONFIG_DEBUG_TCG to avoid bloating the binary. In user-mode, assert_page_locked is equivalent to assert_mmap_lock. Note: There are some tb_lock assertions left that will be removed by later patches. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Suggested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15translate-all: use per-page locking in !user-modeEmilio G. Cota3-41/+409
Groundwork for supporting parallel TCG generation. Instead of using a global lock (tb_lock) to protect changes to pages, use fine-grained, per-page locks in !user-mode. User-mode stays with mmap_lock. Sometimes changes need to happen atomically on more than one page (e.g. when a TB that spans across two pages is added/invalidated, or when a range of pages is invalidated). We therefore introduce struct page_collection, which helps us keep track of a set of pages that have been locked in the appropriate locking order (i.e. by ascending page index). This commit first introduces the structs and the function helpers, to then convert the calling code to use per-page locking. Note that tb_lock is not removed yet. While at it, rename tb_alloc_page to tb_page_add, which pairs with tb_page_remove. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15translate-all: move tb_invalidate_phys_page_range up in the fileEmilio G. Cota1-38/+39
This greatly simplifies next commit's diff. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15translate-all: work page-by-page in tb_invalidate_phys_range_1Emilio G. Cota1-4/+8
So that we pass a same-page range to tb_invalidate_phys_page_range, instead of always passing an end address that could be on a different page. As discussed with Peter Maydell on the list [1], tb_invalidate_phys_page_range doesn't actually do much with 'end', which explains why we have never hit a bug despite going against what the comment on top of tb_invalidate_phys_page_range requires: > * Invalidate all TBs which intersect with the target physical address range > * [start;end[. NOTE: start and end must refer to the *same* physical page. The appended honours the comment, which avoids confusion. While at it, rework the loop into a for loop, which is less error prone (e.g. "continue" won't result in an infinite loop). [1] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg09165.html Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15translate-all: remove hole in PageDescEmilio G. Cota1-1/+1
Groundwork for supporting parallel TCG generation. Move the hole to the end of the struct, so that a u32 field can be added there without bloating the struct. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15translate-all: make l1_map locklessEmilio G. Cota2-12/+16
Groundwork for supporting parallel TCG generation. We never remove entries from the radix tree, so we can use cmpxchg to implement lockless insertions. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15translate-all: iterate over TBs in a page with PAGE_FOR_EACH_TBEmilio G. Cota2-34/+30
This commit does several things, but to avoid churn I merged them all into the same commit. To wit: - Use uintptr_t instead of TranslationBlock * for the list of TBs in a page. Just like we did in (c37e6d7e "tcg: Use uintptr_t type for jmp_list_{next|first} fields of TB"), the rationale is the same: these are tagged pointers, not pointers. So use a more appropriate type. - Only check the least significant bit of the tagged pointers. Masking with 3/~3 is unnecessary and confusing. - Introduce the TB_FOR_EACH_TAGGED macro, and use it to define PAGE_FOR_EACH_TB, which improves readability. Note that TB_FOR_EACH_TAGGED will gain another user in a subsequent patch. - Update tb_page_remove to use PAGE_FOR_EACH_TB. In case there is a bug and we attempt to remove a TB that is not in the list, instead of segfaulting (since the list is NULL-terminated) we will reach g_assert_not_reached(). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15tcg: move tb_ctx.tb_phys_invalidate_count to tcg_ctxEmilio G. Cota4-3/+20
Thereby making it per-TCGContext. Once we remove tb_lock, this will avoid an atomic increment every time a TB is invalidated. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15tcg: track TBs with per-region BST'sEmilio G. Cota6-89/+213
This paves the way for enabling scalable parallel generation of TCG code. Instead of tracking TBs with a single binary search tree (BST), use a BST for each TCG region, protecting it with a lock. This is as scalable as it gets, since each TCG thread operates on a separate region. The core of this change is the introduction of struct tcg_region_tree, which contains a pointer to a GTree and an associated lock to serialize accesses to it. We then allocate an array of tcg_region_tree's, adding the appropriate padding to avoid false sharing based on qemu_dcache_linesize. Given a tc_ptr, we first find the corresponding region_tree. This is done by special-casing the first and last regions first, since they might be of size != region.size; otherwise we just divide the offset by region.stride. I was worried about this division (several dozen cycles of latency), but profiling shows that this is not a fast path. Note that region.stride is not required to be a power of two; it is only required to be a multiple of the host's page size. Note that with this design we can also provide consistent snapshots about all region trees at once; for instance, tcg_tb_foreach acquires/releases all region_tree locks before/after iterating over them. For this reason we now drop tb_lock in dump_exec_info(). As an alternative I considered implementing a concurrent BST, but this can be tricky to get right, offers no consistent snapshots of the BST, and performance and scalability-wise I don't think it could ever beat having separate GTrees, given that our workload is insert-mostly (all concurrent BST designs I've seen focus, understandably, on making lookups fast, which comes at the expense of convoluted, non-wait-free insertions/removals). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15qht: return existing entry when qht_insert failsEmilio G. Cota5-16/+32
The meaning of "existing" is now changed to "matches in hash and ht->cmp result". This is saner than just checking the pointer value. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15qht: require a default comparison functionEmilio G. Cota6-23/+65
qht_lookup now uses the default cmp function. qht_lookup_custom is defined to retain the old behaviour, that is a cmp function is explicitly provided. qht_insert will gain use of the default cmp in the next patch. Note that we move qht_lookup_custom's @func to be the last argument, which makes the new qht_lookup as simple as possible. Instead of this (i.e. keeping @func 2nd): 0000000000010750 <qht_lookup>: 10750: 89 d1 mov %edx,%ecx 10752: 48 89 f2 mov %rsi,%rdx 10755: 48 8b 77 08 mov 0x8(%rdi),%rsi 10759: e9 22 ff ff ff jmpq 10680 <qht_lookup_custom> 1075e: 66 90 xchg %ax,%ax We get: 0000000000010740 <qht_lookup>: 10740: 48 8b 4f 08 mov 0x8(%rdi),%rcx 10744: e9 37 ff ff ff jmpq 10680 <qht_lookup_custom> 10749: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15tcg/i386: Use byte form of xgetbv instructionJohn Arbuckle1-1/+4
The assembler in most versions of Mac OS X is pretty old and does not support the xgetbv instruction. To go around this problem, the raw encoding of the instruction is used instead. Signed-off-by: John Arbuckle <programmingkidx@gmail.com> Message-Id: <20180604215102.11002-1-programmingkidx@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15Merge remote-tracking branch ↵Peter Maydell2-12/+4
'remotes/edgar/tags/edgar/xilinx-next-2018-06-15.for-upstream' into staging xilinx-next-2018-06-15.for-upstream # gpg: Signature made Fri 15 Jun 2018 15:32:47 BST # gpg: using RSA key 29C596780F6BCA83 # gpg: Good signature from "Edgar E. Iglesias (Xilinx key) <edgar.iglesias@xilinx.com>" # gpg: aka "Edgar E. Iglesias <edgar.iglesias@gmail.com>" # Primary key fingerprint: AC44 FEDC 14F7 F1EB EDBF 4151 29C5 9678 0F6B CA83 * remotes/edgar/tags/edgar/xilinx-next-2018-06-15.for-upstream: target-microblaze: Rework NOP/zero instruction handling target-microblaze: mmu: Correct masking of output addresses Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell55-1668/+1692
Block layer patches: - Fix options that work only with -drive or -blockdev, but not with both, because of QDict type confusion - rbd: Add options 'auth-client-required' and 'key-secret' - Remove deprecated -drive options serial/addr/cyls/heads/secs/trans - rbd, iscsi: Remove deprecated 'filename' option - Fix 'qemu-img map' crash with unaligned image size - Improve QMP documentation for jobs # gpg: Signature made Fri 15 Jun 2018 15:20:03 BST # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (26 commits) block: Remove dead deprecation warning code block: Remove deprecated -drive option serial block: Remove deprecated -drive option addr block: Remove deprecated -drive geometry options rbd: New parameter key-secret rbd: New parameter auth-client-required block: Fix -blockdev / blockdev-add for empty objects and arrays check-block-qdict: Cover flattening of empty lists and dictionaries check-block-qdict: Rename qdict_flatten()'s variables for clarity block-qdict: Simplify qdict_is_list() some block-qdict: Clean up qdict_crumple() a bit block-qdict: Tweak qdict_flatten_qdict(), qdict_flatten_qlist() block-qdict: Simplify qdict_flatten_qdict() block: Make remaining uses of qobject input visitor more robust block: Factor out qobject_input_visitor_new_flat_confused() block: Clean up a misuse of qobject_to() in .bdrv_co_create_opts() block: Fix -drive for certain non-string scalars block: Fix -blockdev for certain non-string scalars qobject: Move block-specific qdict code to block-qdict.c block: Add block-specific QDict header ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15Merge remote-tracking branch ↵Peter Maydell48-363/+4114
'remotes/pmaydell/tags/pull-target-arm-20180615' into staging target-arm and miscellaneous queue: * fix KVM state save/restore for GICv3 priority registers for high IRQ numbers * hw/arm/mps2-tz: Put ethernet controller behind PPC * hw/sh/sh7750: Convert away from old_mmio * hw/m68k/mcf5206: Convert away from old_mmio * hw/block/pflash_cfi02: Convert away from old_mmio * hw/watchdog/wdt_i6300esb: Convert away from old_mmio * hw/input/pckbd: Convert away from old_mmio * hw/char/parallel: Convert away from old_mmio * armv7m: refactor to get rid of armv7m_init() function * arm: Don't crash if user tries to use a Cortex-M CPU without an NVIC * hw/core/or-irq: Support more than 16 inputs to an OR gate * cpu-defs.h: Document CPUIOTLBEntry 'addr' field * cputlb: Pass cpu_transaction_failed() the correct physaddr * CODING_STYLE: Define our preferred form for multiline comments * Add and use new stn_*_p() and ldn_*_p() memory access functions * target/arm: More parts of the upcoming SVE support * aspeed_scu: Implement RNG register * m25p80: add support for two bytes WRSR for Macronix chips * exec.c: Handle IOMMUs being in the path of TCG CPU memory accesses * target/arm: Allow ARMv6-M Thumb2 instructions # gpg: Signature made Fri 15 Jun 2018 15:24:03 BST # gpg: using RSA key 3C2525ED14360CDE # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" # gpg: aka "Peter Maydell <pmaydell@gmail.com>" # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20180615: (43 commits) target/arm: Allow ARMv6-M Thumb2 instructions exec.c: Handle IOMMUs in address_space_translate_for_iotlb() iommu: Add IOMMU index argument to translate method iommu: Add IOMMU index argument to notifier APIs iommu: Add IOMMU index concept to IOMMU API m25p80: add support for two bytes WRSR for Macronix chips aspeed_scu: Implement RNG register target/arm: Implement SVE Floating Point Arithmetic - Unpredicated Group target/arm: Implement SVE Integer Wide Immediate - Unpredicated Group target/arm: Implement FDUP/DUP target/arm: Implement SVE Integer Compare - Scalars Group target/arm: Implement SVE Predicate Count Group target/arm: Implement SVE Partition Break Group target/arm: Implement SVE Integer Compare - Immediate Group target/arm: Implement SVE Integer Compare - Vectors Group target/arm: Implement SVE Select Vectors Group target/arm: Implement SVE vector splice (predicated) target/arm: Implement SVE reverse within elements target/arm: Implement SVE copy to vector (predicated) target/arm: Implement SVE conditionally broadcast/extract element ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Allow ARMv6-M Thumb2 instructionsJulia Suvorova1-5/+38
ARMv6-M supports 6 Thumb2 instructions. This patch checks for these instructions and allows their execution. Like Thumb2 cores, ARMv6-M always interprets BL instruction as 32-bit. This patch is required for future Cortex-M0 support. Signed-off-by: Julia Suvorova <jusual@mail.ru> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20180612204632.28780-1-jusual@mail.ru [PMM: move armv6m_insn[] and armv6m_mask[] closer to point of use, and mark 'const'. Check for M-and-not-v7 rather than M-and-6.] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15exec.c: Handle IOMMUs in address_space_translate_for_iotlb()Peter Maydell4-4/+140
Currently we don't support board configurations that put an IOMMU in the path of the CPU's memory transactions, and instead just assert() if the memory region fonud in address_space_translate_for_iotlb() is an IOMMUMemoryRegion. Remove this limitation by having the function handle IOMMUs. This is mostly straightforward, but we must make sure we have a notifier registered for every IOMMU that a transaction has passed through, so that we can flush the TLB appropriately when any of the IOMMUs change their mappings. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20180604152941.20374-5-peter.maydell@linaro.org
2018-06-15iommu: Add IOMMU index argument to translate methodPeter Maydell12-13/+24
Add an IOMMU index argument to the translate method of IOMMUs. Since all of our current IOMMU implementations support only a single IOMMU index, this has no effect on the behaviour. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20180604152941.20374-4-peter.maydell@linaro.org
2018-06-15iommu: Add IOMMU index argument to notifier APIsPeter Maydell7-10/+30
Add support for multiple IOMMU indexes to the IOMMU notifier APIs. When initializing a notifier with iommu_notifier_init(), the caller must pass the IOMMU index that it is interested in. When a change happens, the IOMMU implementation must pass memory_region_notify_iommu() the IOMMU index that has changed and that notifiers must be called for. IOMMUs which support only a single index don't need to change. Callers which only really support working with IOMMUs with a single index can use the result of passing MEMTXATTRS_UNSPECIFIED to memory_region_iommu_attrs_to_index(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20180604152941.20374-3-peter.maydell@linaro.org
2018-06-15iommu: Add IOMMU index concept to IOMMU APIPeter Maydell2-0/+78
If an IOMMU supports mappings that care about the memory transaction attributes, then it no longer has a unique address -> output mapping, but more than one. We can represent these using an IOMMU index, analogous to TCG's mmu indexes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20180604152941.20374-2-peter.maydell@linaro.org
2018-06-15m25p80: add support for two bytes WRSR for Macronix chipsCédric Le Goater1-0/+1
On Macronix chips, two bytes can written to the WRSR. First byte will configure the status register and the second the configuration register. It is important to save the configuration value as it contains the dummy cycle setting when using dual or quad IO mode. Signed-off-by: Cédric Le Goater <clg@kaod.org> Acked-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15aspeed_scu: Implement RNG registerJoel Stanley1-0/+20
The ASPEED SoCs contain a single register that returns random data when read. This models that register so that guests can use it. The random number data register has a corresponding control register, however it returns data regardless of the state of the enabled bit, so the model follows this behaviour. When the qcrypto call fails we exit as the guest uses the random number device to feed it's entropy pool, which is used for cryptographic purposes. Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Joel Stanley <joel@jms.id.au> Message-id: 20180613114836.9265-1-joel@jms.id.au Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE Floating Point Arithmetic - Unpredicated GroupRichard Henderson5-0/+154
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-19-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE Integer Wide Immediate - Unpredicated GroupRichard Henderson4-0/+236
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-18-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement FDUP/DUPRichard Henderson2-0/+45
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-17-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE Integer Compare - Scalars GroupRichard Henderson4-0/+140
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-16-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE Predicate Count GroupRichard Henderson4-0/+176
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-15-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE Partition Break GroupRichard Henderson4-0/+391
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-14-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE Integer Compare - Immediate GroupRichard Henderson4-0/+221
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-13-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE Integer Compare - Vectors GroupRichard Henderson4-0/+417
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-12-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE Select Vectors GroupRichard Henderson4-0/+72
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-11-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE vector splice (predicated)Richard Henderson4-0/+55
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-10-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE reverse within elementsRichard Henderson4-7/+93
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-9-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE copy to vector (predicated)Richard Henderson2-0/+25
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-8-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE conditionally broadcast/extract elementRichard Henderson4-0/+362
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-7-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE compress active elementsRichard Henderson4-0/+55
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-6-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE Permute - Interleaving GroupRichard Henderson4-0/+172
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-5-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE Permute - Predicates GroupRichard Henderson4-0/+434
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-4-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Implement SVE Permute - Unpredicated GroupRichard Henderson4-0/+297
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-3-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15target/arm: Extend vec_reg_offset to larger sizesRichard Henderson1-9/+17
Rearrange the arithmetic so that we are agnostic about the total size of the vector and the size of the element. This will allow us to index up to the 32nd byte and with 16-byte elements. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180613015641.5667-2-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15exec.c: Use stn_p() and ldn_p() instead of explicit switchesPeter Maydell1-104/+8
Now we have stn_p() and ldn_p() we can use them in various functions in exec.c that used to have their own switch-on-size code. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180611171007.4165-4-peter.maydell@linaro.org
2018-06-15exec.c: Don't accidentally sign-extend 4-byte loads in subpage_read()Peter Maydell1-1/+1
In subpage_read() we perform a load of the data into a local buffer which we then access using ldub_p(), lduw_p(), ldl_p() or ldq_p() depending on its size, storing the result into the uint64_t *data. Since ldl_p() returns an 'int', this means that for the 4-byte case we will sign-extend the data, whereas for 1 and 2 byte reads we zero-extend it. This ought not to matter since the caller will likely ignore values in the high bytes of the data, but add a cast so that we're consistent. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180611171007.4165-3-peter.maydell@linaro.org
2018-06-15bswap: Add new stn_*_p() and ldn_*_p() memory access functionsPeter Maydell3-0/+71
There's a common pattern in QEMU where a function needs to perform a data load or store of an N byte integer in a particular endianness. At the moment this is handled by doing a switch() on the size and calling the appropriate ld*_p or st*_p function for each size. Provide a new family of functions ldn_*_p() and stn_*_p() which take the size as an argument and do the switch() themselves. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180611171007.4165-2-peter.maydell@linaro.org