aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2016-05-19block: Remove BlockDriverState.blkKevin Wolf1-2/+1
This patch removes the remaining users of bs->blk, which will allow us to have multiple BBs on top of a single BDS. In the meantime, all checks that are currently in place to prevent the user from creating such setups can be switched to bdrv_has_blk() instead of accessing BDS.blk. Future patches can allow them and e.g. enable users to mirror to a block device that already has a BlockBackend on it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19block: Avoid bs->blk in bdrv_next()Kevin Wolf2-2/+2
We need to introduce a separate BdrvNextIterator struct that can keep more state than just the current BDS in order to avoid using the bs->blk pointer. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19block: Add bdrv_has_blk()Kevin Wolf1-0/+1
In many cases we just want to know whether a BDS has at least one BB attached, without needing to know the exact BB that is attached. In contrast to bs->blk, this is still a valid question when more than one BB can be attached, so just answer it by checking the parents list. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19block: Remove bdrv_aio_multiwrite()Kevin Wolf2-6/+2
Since virtio-blk implements request merging itself these days, the only remaining users are test cases for the function. That doesn't make the function exactly useful any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2016-05-19blockjob: Don't set iostatus of targetKevin Wolf1-3/+1
When block job errors were introduced, we assigned the iostatus of the target BDS "just in case". The field has never been accessible for the user because the target isn't listed in query-block. Before we can allow the user to have a second BlockBackend on the target, we need to clean this up. If anything, we would want to set the iostatus for the internal BB of the job (which we can always do later), but certainly not for a separate BB which the job doesn't even use. As a nice side effect, this gets us rid of another bs->blk use. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19block: User BdrvChild callback for device nameKevin Wolf1-0/+5
In order to get rid of bs->blk for bdrv_get_device_name() and bdrv_get_device_or_node_name(), ask all parents for their name and simply pick the first one. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19block: Use BdrvChild callbacks for change_media/resizeKevin Wolf1-1/+3
We want to get rid of BlockDriverState.blk in order to allow multiple BlockBackends per BDS. Converting the device callbacks in block.c (which assume a single BlockBackend) to per-child callbacks gets us rid of the first few instances. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19block: Decouple throttling from BlockDriverStateKevin Wolf1-3/+0
This moves the throttling related part of the BDS life cycle management to BlockBackend. The throttling group reference is now kept even when no medium is inserted. With this commit, throttling isn't disabled and then re-enabled any more during graph reconfiguration. This fixes the temporary breakage of I/O throttling when used with live snapshots or block jobs that manipulate the graph. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Drain throttling queue with BdrvChild callbackKevin Wolf1-5/+11
This removes the last part of I/O throttling from block/io.c and moves it to the BlockBackend. Instead of having knowledge about throttling inside io.c, we can call a BdrvChild callback .drained_begin/end, which happens to drain the throttled requests for BlockBackend parents. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Introduce BdrvChild.opaqueKevin Wolf1-0/+1
BlockBackends use it to get a back pointer from BdrvChild to BlockBackend in any BdrvChildRole callbacks. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Move I/O throttling configuration functions to BlockBackendKevin Wolf4-8/+8
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Move actual I/O throttling to BlockBackendKevin Wolf1-1/+1
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Move throttling fields from BDS to BBKevin Wolf3-15/+11
This patch changes where the throttling state is stored (used to be the BlockDriverState, now it is the BlockBackend), but it doesn't actually make it a BB level feature yet. For example, throttling is still disabled when the BDS is detached from the BB. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Convert throttle_group_get_name() to BlockBackendKevin Wolf1-1/+1
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: throttle-groups: Use BlockBackend pointers internallyKevin Wolf3-4/+6
As a first step towards moving I/O throttling to the BlockBackend level, this patch changes all pointers in struct ThrottleGroup from referencing a BlockDriverState to referencing a BlockBackend. This change is valid because we made sure that throttling can only be enabled on BDSes which have a BB attached. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Introduce BlockBackendPublicKevin Wolf1-0/+10
Some features, like I/O throttling, are implemented outside block-backend.c, but still want to keep information in BlockBackend, e.g. list entries that allow keeping a list of BlockBackends. In order to avoid exposing the whole struct layout in the public header file, this patch introduces an embedded public struct where such information can be added and a pair of functions to convert between BlockBackend and BlockBackendPublic. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-18Fix some typos found by codespellStefan Weil4-4/+4
Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18smbios: fix typoCao jin1-1/+1
The spec says: "on paragraph (16-byte) boundaries" Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18accel: make configure_accelerator return voidWei Jiangang1-1/+1
Return the negated value of accel_initialised is meaningless, and the caller vl doesn't check it. Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18ipack: Update e-mail addressAlberto Garcia1-1/+1
I'm not really using the old one anymore. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18remove comment for nonexistent structure memberCao jin1-1/+0
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-17s390x: enable runtime instrumentationFan Zhang1-0/+4
Introduce run-time-instrumentation support when running under kvm for virtio-ccw 2.7 machine and make sure older machines can not enable it. The new ri_allowed field in the s390MachineClass serves as an indicator whether the feature can be used by the machine and should therefore be activated if available. riccb_needed() is used to check whether riccb is needed or not in live migration. Signed-off-by: Fan Zhang <zhangfan@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17s390x: add compat machine for 2.7Cornelia Huck1-0/+3
Also add some of the option cascading we were missing. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-13Merge remote-tracking branch 'remotes/kraxel/tags/pull-ui-20160513-1' into ↵Peter Maydell1-0/+1
staging gtk/sdl build tweaks fix gtk 3.20 warnings gtk clipboard support spice-gl monitor config support fix coverity warnings # gpg: Signature made Fri 13 May 2016 13:30:39 BST using RSA key ID D3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" * remotes/kraxel/tags/pull-ui-20160513-1: gtk: don't leak the GtkBorder with VTE 0.36 gtk: update grab code for gtk 3.20 spice: fix coverity complains egl-helpers: fix possible resource leak Changed malloc to g_malloc, free to g_free in ui/shader.c spice/gl: add & use qemu_spice_gl_monitor_config ui/gtk: copy to clipboard support ui: gtk: Fix some deprecation warnings ui: gtk: Fix a runtime warning on vte >= 0.37 configure: support vte-2.91 configure: report SDL version configure: report GTK version configure: add echo_version helper configure: error on unknown --with-sdlabi value configure: build SDL if only SDL2 available ui: sdl2: Release grab before opening console window ui: gtk: fix crash when terminal inner-border is NULL Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12tcg: Remove needless CPUState::current_tbSergey Fedorov1-2/+0
This field was used for telling cpu_interrupt() to unlink a chain of TBs being executed when it worked that way. Now, cpu_interrupt() don't do this anymore. So we don't need this field anymore. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1462273462-14036-1-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg: Rework tb_invalidated_flagSergey Fedorov2-2/+2
'tb_invalidated_flag' was meant to catch two events: * some TB has been invalidated by tb_phys_invalidate(); * the whole translation buffer has been flushed by tb_flush(). Then it was checked: * in cpu_exec() to ensure that the last executed TB can be safely linked to directly call the next one; * in cpu_exec_nocache() to decide if the original TB should be provided for further possible invalidation along with the temporarily generated TB. It is always safe to patch an invalidated TB since it is not going to be used anyway. It is also safe to call tb_phys_invalidate() for an already invalidated TB. Thus, setting this flag in tb_phys_invalidate() is simply unnecessary. Moreover, it can prevent from pretty proper linking of TBs, if any arbitrary TB has been invalidated. So just don't touch it in tb_phys_invalidate(). If this flag is only used to catch whether tb_flush() has been called then rename it to 'tb_flushed'. Declare it as 'bool' and stick to using only 'true' and 'false' to set its value. Also, instead of setting it in tb_gen_code(), just after tb_flush() has been called, do it right inside of tb_flush(). In cpu_exec(), this flag is used to track if tb_flush() has been called and have made 'next_tb' (a reference to the last executed TB) invalid for linking it to directly call the next TB. tb_flush() can be called during the CPU execution loop from tb_gen_code(), during TB execution or by another thread while 'tb_lock' is released. Catch for translation buffer flush reliably by resetting this flag once before first TB lookup and each time we find it set before trying to add a direct jump. Don't touch in in tb_find_physical(). Each vCPU has its own execution loop in multithreaded mode and thus should have its own copy of the flag to be able to reset it with its own 'next_tb' and don't affect any other vCPU execution thread. So make this flag per-vCPU and move it to CPUState. In cpu_exec_nocache(), we only need to check if tb_flush() has been called from tb_gen_code() called by cpu_exec_nocache() itself. To do this reliably, preserve the old value of the flag, reset it before calling tb_gen_code(), check afterwards, and combine the saved value back to the flag. This patch is based on the patch "tcg: move tb_invalidated_flag to CPUState" from Paolo Bonzini <pbonzini@redhat.com>. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg: Clarify thread safety check in tb_add_jump()Sergey Fedorov1-13/+16
The check is to make sure that another thread hasn't already done the same while we were outside of tb_lock. Mention this in a comment. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg: Use uintptr_t type for jmp_list_{next|first} fields of TBSergey Fedorov1-5/+7
These fields do not contain pure pointers to a TranslationBlock structure. So uintptr_t is the most appropriate type for them. Also put some asserts to assure that the two least significant bits of the pointer are always zero before assigning it to jmp_list_first. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg: Clean up direct block chaining data fieldsSergey Fedorov1-16/+28
Briefly describe in a comment how direct block chaining is done. It should help in understanding of the following data fields. Rename some fields in TranslationBlock and TCGContext structures to better reflect their purpose (dropping excessive 'tb_' prefix in TranslationBlock but keeping it in TCGContext): tb_next_offset => jmp_reset_offset tb_jmp_offset => jmp_insn_offset tb_next => jmp_target_addr jmp_next => jmp_list_next jmp_first => jmp_list_first Avoid using a magic constant as an invalid offset which is used to indicate that there's no n-th jump generated. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg: Note requirement on atomic direct jump patchingSergey Fedorov1-0/+1
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <1461341333-19646-12-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg/arm: Make direct jump patching thread-safeSergey Fedorov1-23/+2
Ensure direct jump patching in ARM is atomic by using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-8-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg/s390: Make direct jump patching thread-safeSergey Fedorov1-1/+1
Ensure direct jump patching in s390 is atomic by: * naturally aligning a location of direct jump address; * using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-7-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg/i386: Make direct jump patching thread-safeSergey Fedorov1-1/+1
Ensure direct jump patching in i386 is atomic by: * naturally aligning a location of direct jump address; * using atomic_read()/atomic_set() for code patching. tcg_out_nopn() implementation: Suggested-by: Richard Henderson <rth@twiddle.net>. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-6-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tci: Make direct jump patching thread-safeSergey Fedorov1-1/+1
Ensure direct jump patching in TCI is atomic by: * naturally aligning a location of direct jump address; * using atomic_read()/atomic_set() to load/store the address. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-4-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12include/qemu/osdep.h: Add macros for pointer alignmentSergey Fedorov1-0/+11
These macros provide a convenient way to n-byte align pointers up and down and check if a pointer is n-byte aligned. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-3-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12include/qemu/osdep.h: Add a macro to check for alignmentSergey Fedorov1-0/+3
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-2-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tb: consistently use uint32_t for tb->flagsEmilio G. Cota1-2/+3
We are inconsistent with the type of tb->flags: usage varies loosely between int and uint64_t. Settle to uint32_t everywhere, which is superior to both: at least one target (aarch64) uses the most significant bit in the u32, and uint64_t is wasteful. Compile-tested for all targets. Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com> Suggested-by: Richard Henderson <rth@twiddle.net> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1460049562-23517-1-git-send-email-cota@braap.org>
2016-05-12Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell5-37/+58
Block layer patches # gpg: Signature made Thu 12 May 2016 14:37:05 BST using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: (69 commits) qemu-iotests: iotests: fail hard if not run via "check" block: enable testing of LUKS driver with block I/O tests block: add support for encryption secrets in block I/O tests block: add support for --image-opts in block I/O tests qemu-io: Add 'write -z -u' to test MAY_UNMAP flag qemu-io: Add 'write -f' to test FUA flag qemu-io: Allow unaligned access by default qemu-io: Use bool for command line flags qemu-io: Make 'open' subcommand more like command line qemu-io: Add missing option documentation qmp: add monitor command to add/remove a child quorum: implement bdrv_add_child() and bdrv_del_child() Add new block driver interface to add/delete a BDS's child qemu-img: check block status of backing file when converting. iotests: fix the redirection order in 083 block: Inactivate all children block: Drop superfluous invalidating bs->file from drivers block: Invalidate all children nbd: Simplify client FUA handling block: Honor BDRV_REQ_FUA during write_zeroes ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12Merge remote-tracking branch ↵Peter Maydell7-10/+850
'remotes/pmaydell/tags/pull-target-arm-20160512' into staging target-arm queue: * blizzard, omap_lcdc: code cleanup to remove DEPTH != 32 dead code * QOMify various ARM devices * bcm2835_property: use cached values when querying framebuffer * hw/arm/nseries: don't allocate large sized array on the stack * fix LPAE descriptor address masking (only visible for EL2) * fix stage 2 exec permission handling for AArch32 * first part of supporting syndrome info for data aborts to EL2 * virt: NUMA support * work towards i.MX6 support * avoid unnecessary TLB flush on TCR_EL2, TCR_EL3 writes # gpg: Signature made Thu 12 May 2016 14:29:14 BST using RSA key ID 14360CDE # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" # gpg: aka "Peter Maydell <pmaydell@gmail.com>" # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" * remotes/pmaydell/tags/pull-target-arm-20160512: (43 commits) hw/arm: QOM'ify versatilepb.c hw/arm: QOM'ify strongarm.c hw/arm: QOM'ify stellaris.c hw/arm: QOM'ify spitz.c hw/arm: QOM'ify pxa2xx_pic.c hw/arm: QOM'ify pxa2xx.c hw/arm: QOM'ify integratorcp.c hw/arm: QOM'ify highbank.c hw/arm: QOM'ify armv7m.c target-arm: Avoid unnecessary TLB flush on TCR_EL2, TCR_EL3 writes hw/display/blizzard: Remove blizzard_template.h hw/display/blizzard: Expand out macros i.MX: Add sabrelite i.MX6 emulation. i.MX: Add i.MX6 SOC implementation. i.MX: Add the Freescale SPI Controller FIFO: Add a FIFO32 implementation i.MX: Add i.MX6 System Reset Controller device. ARM: Factor out ARM on/off PSCI control functions ACPI: Virt: Generate SRAT table ACPI: move acpi_build_srat_memory to common place ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12spice/gl: add & use qemu_spice_gl_monitor_configGerd Hoffmann1-0/+1
Cc: qemu-stable@nongnu.org Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-05-12quorum: implement bdrv_add_child() and bdrv_del_child()Wen Congyang1-0/+4
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Message-id: 1462865799-19402-3-git-send-email-xiecl.fnst@cn.fujitsu.com Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12Add new block driver interface to add/delete a BDS's childWen Congyang2-0/+9
In some cases, we want to take a quorum child offline, and take another child online. Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1462865799-19402-2-git-send-email-xiecl.fnst@cn.fujitsu.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12block: Honor BDRV_REQ_FUA during write_zeroesEric Blake1-2/+5
The block layer has a couple of cases where it can lose Force Unit Access semantics when writing a large block of zeroes, such that the request returns before the zeroes have been guaranteed to land on underlying media. SCSI does not support FUA during WRITESAME(10/16); FUA is only supported if it falls back to WRITE(10/16). But where the underlying device is new enough to not need a fallback, it means that any upper layer request with FUA semantics was silently ignoring BDRV_REQ_FUA. Conversely, NBD has situations where it can support FUA but not ZERO_WRITE; when that happens, the generic block layer fallback to bdrv_driver_pwritev() (or the older bdrv_co_writev() in qemu 2.6) was losing the FUA flag. The problem of losing flags unrelated to ZERO_WRITE has been latent in bdrv_co_do_write_zeroes() since commit aa7bfbff, but back then, it did not matter because there was no FUA flag. It became observable when commit 93f5e6d8 paved the way for flags that can impact correctness, when we should have been using bdrv_co_writev_flags() with modified flags. Compare to commit 9eeb6dd, which got flag manipulation right in bdrv_co_do_zero_pwritev(). Symptoms: I tested with qemu-io with default writethrough cache (which is supposed to use FUA semantics on every write), and targetted an NBD client connected to a server that intentionally did not advertise NBD_FLAG_SEND_FUA. When doing 'write 0 512', the NBD client sent two operations (NBD_CMD_WRITE then NBD_CMD_FLUSH) to get the fallback FUA semantics; but when doing 'write -z 0 512', the NBD client sent only NBD_CMD_WRITE. The fix is do to a cleanup bdrv_co_flush() at the end of the operation if any step in the middle relied on a BDS that does not natively support FUA for that step (note that we don't need to flush after every operation, if the operation is broken into chunks based on bounce-buffer sizing). Each BDS gains a new flag .supported_zero_flags, which parallels the use of .supported_write_flags but only when accessing a zero write operation (the flags MUST be different, because of SCSI having different semantics based on WRITE vs. WRITESAME; and also because BDRV_REQ_MAY_UNMAP only makes sense on zero writes). Also fix some documentation to describe -ENOTSUP semantics, particularly since iscsi depends on those semantics. Down the road, we may want to add a driver where its .bdrv_co_pwritev() honors all three of BDRV_REQ_FUA, BDRV_REQ_ZERO_WRITE, and BDRV_REQ_MAY_UNMAP, and advertise this via bs->supported_write_flags for blocks opened by that driver; such a driver should NOT supply .bdrv_co_write_zeroes nor .supported_zero_flags. But none of the drivers touched in this patch want to do that (the act of writing zeroes is different enough from normal writes to deserve a second callback). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Make supported_write_flags a per-bds propertyEric Blake1-2/+2
Pre-patch, .supported_write_flags lives at the driver level, which means we are blindly declaring that all block devices using a given driver will either equally support FUA, or that we need a fallback at the block layer. But there are drivers where FUA support is a per-block decision: the NBD block driver is dependent on the remote server advertising NBD_FLAG_SEND_FUA (and has fallback code to duplicate the flush that the block layer would do if NBD had not set .supported_write_flags); and the iscsi block driver is dependent on the mode sense bits advertised by the underlying device (and is currently silently ignoring FUA requests if the underlying device does not support FUA). The fix is to make supported flags as a per-BDS option, set during .bdrv_open(). This patch moves the variable and fixes NBD and iscsi to set it only conditionally; later patches will then further simplify the NBD driver to quit duplicating work done at the block layer, as well as tackle the fact that SCSI does not support FUA semantics on WRITESAME(10/16) but only on WRITE(10/16). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Kill unused sector-based blk_* functionsEric Blake1-10/+0
Now that there are no remaining clients, we can drop the sector-based blk_read(), blk_write(), blk_aio_readv(), and blk_aio_writev(). Sadly, there are still remaining sector-based interfaces, such as blk_*discard(), or blk_write_compressed(); those will have to wait for another day. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12ide: Switch to byte-based aio block accessEric Blake1-2/+2
Sector-based blk_aio_readv() and blk_aio_writev() should die; switch to byte-based blk_aio_preadv() and blk_aio_pwritev() instead. The patch had to touch multiple files at once, because dma_blk_io() takes pointers to the functions, and ide_issue_trim() piggybacks on the same interface (while ignoring offset under the hood). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Introduce byte-based aio read/writeEric Blake1-1/+7
blk_aio_readv() and blk_aio_writev() are annoying in that they can't access sub-sector granularity, and cannot pass flags. Also, they require the caller to pass redundant information about the size of the I/O (qiov->size in bytes must match nb_sectors in sectors). Add new blk_aio_preadv() and blk_aio_pwritev() functions to fix the flaws. The next few patches will upgrade callers, then finally delete the old interfaces. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Switch blk_*write_zeroes() to byte interfaceEric Blake1-6/+6
Sector-based blk_write() should die; convert the one-off variant blk_write_zeroes() to use an offset/count interface instead. Likewise for blk_co_write_zeroes() and blk_aio_write_zeroes(). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Switch blk_read_unthrottled() to byte interfaceEric Blake1-2/+2
Sector-based blk_read() should die; convert the one-off variant blk_read_unthrottled(). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Allow BDRV_REQ_FUA through blk_pwrite()Eric Blake1-1/+2
We have several block drivers that understand BDRV_REQ_FUA, and emulate it in the block layer for the rest by a full flush. But without a way to actually request BDRV_REQ_FUA during a pass-through blk_pwrite(), FUA-aware block drivers like NBD are forced to repeat the emulation logic of a full flush regardless of whether the backend they are writing to could do it more efficiently. This patch just wires up a flags argument; followup patches will actually make use of it in the NBD driver and in qemu-io. Signed-off-by: Eric Blake <eblake@redhat.com> Acked-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>