aboutsummaryrefslogtreecommitdiff
path: root/block/qcow2-cluster.c
AgeCommit message (Collapse)AuthorFilesLines
2013-03-28qcow2: handle_alloc(): Get rid of keep_clusters parameterKevin Wolf1-17/+27
handle_alloc() is now called with the offset at which the actual new allocation starts instead of the offset at which the whole write request starts, part of which may already be processed. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28qcow2: handle_alloc(): Get rid of nb_clusters parameterKevin Wolf1-4/+15
We already communicate the same information in *bytes. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28qcow2: Factor out handle_alloc()Kevin Wolf1-89/+151
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28qcow2: Decouple cluster allocation from cluster reuse codeKevin Wolf1-15/+20
This moves some code that prepares the allocation of new clusters to where the actual allocation happens. This is the minimum required to be able to move it to a separate function in the next patch. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28qcow2: Change handle_dependency to byte granularityKevin Wolf1-12/+28
This is a more precise description of what really constitutes a dependency. The behaviour doesn't change at this point because the COW area of the old request is still aligned to cluster boundaries and therefore an overlap is detected wheneven the requests touch any part of the same cluster. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28qcow2: Improve check for overlapping allocationsKevin Wolf1-1/+1
The old code detected an overlapping allocation even when the allocations didn't actually overlap, but were only adjacent. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28qcow2: Handle dependencies earlierKevin Wolf1-16/+43
Handling overlapping allocations isn't just a detail of cluster allocation. It is rather one of three ways to get the host cluster offset for a write request: 1. If a request overlaps an in-flight allocations, the cluster offset can be taken from there (this is what handle_dependencies will evolve into) or the request must just wait until the allocation has completed. Accessing the L2 is not valid in this case, it has outdated information. 2. Outside overlapping areas, check the clusters that can be written to as they are, with no COW involved. 3. If a COW is required, allocate new clusters Changing the code to reflect this doesn't change the behaviour because overlaps cannot exist for clusters that are kept in step 2. It does however make it easier for later patches to work on clusters that belong to an allocation that is still in flight. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15qcow2: make is_allocated return true for zero clustersPaolo Bonzini1-0/+3
Otherwise, live migration of the top layer will miss zero clusters and let the backing file show through. This also matches what is done in qed. QCOW2_CLUSTER_ZERO clusters are invalid in v2 image files. Check this directly in qcow2_get_cluster_offset instead of replicating the test everywhere. Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15qcow2: Allow lazy refcounts to be enabled on the command lineKevin Wolf1-1/+1
qcow2 images now accept a boolean lazy_refcounts options. Use it like this: -drive file=test.qcow2,lazy_refcounts=on If the option is specified on the command line, it overrides the default specified by the qcow2 header flags that were set when creating the image. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-12-19block: move include files to include/block/Paolo Bonzini1-1/+1
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-13qcow2: Factor out handle_dependencies()Kevin Wolf1-28/+42
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Enable dirty flag in qcow2_alloc_cluster_link_l2Kevin Wolf1-1/+4
This is closer to where the dirty flag is really needed, and it avoids having checks for special cases related to cluster allocation directly in the writev loop. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Allocate l2meta only for cluster allocationsKevin Wolf1-14/+9
Even for writes to already allocated clusters, an l2meta is allocated, though it stays effectively unused. After this patch, only allocating requests still have one. Each l2meta now describes an in-flight request that writes to clusters that are not yet hooked up in the L2 table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Drop l2meta.cluster_offsetKevin Wolf1-4/+6
There's no real reason to have an l2meta for normal requests that don't allocate anything. Before we can get rid of it, we must return the host cluster offset in a different way. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Introduce Qcow2COWRegionKevin Wolf1-30/+53
This makes it easier to address the areas for which a COW must be performed. As a nice side effect, the COW code in qcow2_alloc_cluster_link_l2 becomes really trivial. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Round QCowL2Meta.offset down to cluster boundaryKevin Wolf1-2/+2
The offset within the cluster is already present as n_start and this is what the code uses. QCowL2Meta.offset is only needed at a cluster granularity. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-06qcow2: implement lazy refcountsStefan Hajnoczi1-1/+4
Lazy refcounts is a performance optimization for qcow2 that postpones refcount metadata updates and instead marks the image dirty. In the case of crash or power failure the image will be left in a dirty state and repaired next time it is opened. Reducing metadata I/O is important for cache=writethrough and cache=directsync because these modes guarantee that data is on disk after each write (hence we cannot take advantage of caching updates in RAM). Refcount metadata is not needed for guest->file block address translation and therefore does not need to be on-disk at the time of write completion - this is the motivation behind the lazy refcount optimization. The lazy refcount optimization must be enabled at image creation time: qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=on a.qcow2 10G qemu-system-x86_64 -drive if=virtio,file=a.qcow2,cache=writethrough Update qemu-iotests 031 and 036 since the extension header size changes when we add feature bit table entries. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15qcow2: Fix avail_sectors in cluster allocation codeKevin Wolf1-1/+9
avail_sectors should really be the number of sectors from the start of the allocation, not from the start of the write request. We're lucky enough that this mistake didn't cause any real bug. avail_sectors is only used in the intialiser of QCowL2Meta: .nb_available = MIN(requested_sectors, avail_sectors), m->nb_available in turn is only used for COW at the end of the allocation. A COW occurs only if the request wasn't cluster aligned, which in turn would imply that requested_sectors was less than avail_sectors (both in the original and in the fixed version). In this case avail_sectors is ignored and therefore the mistake doesn't cause any misbehaviour. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15qcow2: Simplify calculation for COW area at the endKevin Wolf1-3/+2
copy_sectors() always uses the sum (cluster_offset + n_start) or (start_sect + n_start), so if some value is added to both cluster_offset and start_sect, and subtracted from n_start, it's cancelled out anyway. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15qcow2: remove a line of unnecessary codeZhi Yong Wu1-1/+0
Commit 3948d1d4 removed the pointer argument we filled in with l2_offset but forgot to remove the unnecessary l2_offset assignment. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15qcow2: Silence false warningKevin Wolf1-0/+2
Some gcc versions seem not to be able to figure out that the switch statement covers all possible values and that c is therefore always initialised. Add a default branch for them. Reported-by: malc <av1474@comtv.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: malc <av1474@comtv.ru>
2012-05-25qcow2: Check qcow2_alloc_clusters_at() return valueKevin Wolf1-10/+13
When using qcow2_alloc_clusters_at(), the cluster allocation code checked the wrong variable for an error code. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-07qcow2: Limit COW to where it's neededKevin Wolf1-5/+9
This fixes a regression introduced in commit 250196f1. The bug leads to data corruption, found during an Autotest run with a Fedora 8 guest. Consider a write request whose first part is covered by an already allocated cluster, but additional clusters need to be newly allocated. When counting the number of clusters to allocate, the qcow2 code would decide to do COW for all remaining clusters of the write request, even if some of them are already allocated. If during this COW operation another write request is issued that touches the same cluster, it will still refer to the old cluster. When the COW completes, the first request will update the L2 table and the second write request will be lost. Note that the requests need not overlap, it's enough for them to touch the same cluster. This patch ensures that only clusters that really require COW are considered for allocation. In this case any other request writing to the same cluster will be an allocating write and gets serialised. Reported-by: Marcelo Tosatti <mtosatti@redhat.com> Tested-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-02qcow2: Don't hold cache references across yieldKevin Wolf1-8/+13
If cache references are held while the coroutine has yielded, the cache may get used up and abort() when it can't find a free entry. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-02qcow2: Remove unused parameter in do_alloc_cluster_offsetKevin Wolf1-2/+2
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Zero write supportKevin Wolf1-0/+72
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Support reading zero clustersKevin Wolf1-4/+13
This adds support for reading zero clusters in version 3 images. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Simplify count_cow_clustersKevin Wolf1-18/+15
count_cow_clusters() tries to reuse existing functions, and all it achieves is to make things much more complicated than they really are: Everything needs COW, unless it's a normal cluster with refcount 1. This patch implements the obvious way of doing this, and by using qcow2_get_cluster_type() it gets rid of all flag magic. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Ignore reserved bits in L1/L2 entriesKevin Wolf1-13/+13
This changes the still existing places that assume that the only flags are QCOW_OFLAG_COPIED and QCOW_OFLAG_COMPRESSED to properly mask out reserved bits. It does not convert bdrv_check yet. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Fail write_compressed when overwriting dataKevin Wolf1-4/+3
qcow2_alloc_compressed_cluster_offset() already fails if the copied flag is set, because qcow2_write_compressed() doesn't perform COW as it would have to do to allow this. However, what we really want to check here is whether the cluster is allocated or not. With internal snapshots the copied flag may not be set on allocated clusters. Check the cluster offset instead. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Ignore reserved bits in count_contiguous_clusters()Kevin Wolf1-10/+28
Until now, count_contiguous_clusters() has an argument that allowed to specify flags that should be ignored in the comparison, i.e. that are allowed to change between contiguous clusters. This patch changes the function so that it ignores all flags by default now and you need to pass the flags on which it should stop. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Ignore reserved bits in get_cluster_offsetKevin Wolf1-16/+25
With this change, reading from a qcow2 image ignores all reserved bits that are set in an L1 or L2 table entry. Now get_cluster_offset() assigns *cluster_offset only the offset without any other flags. The cluster type is not longer encoded in the offset, but a positive return value in case of success. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-19qcow2: Fix error handling in qcow2_alloc_cluster_offsetKevin Wolf1-1/+1
If do_alloc_cluster_offset() fails, the error handling code tried to remove the request from the in-flight queue, to which it wasn't added yet, resulting in a NULL pointer dereference. m->nb_clusters really only becomes != 0 when the request is in the list. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-05qcow2: Remove unused parameter in get_cluster_table()Kevin Wolf1-10/+8
Since everything goes through the cache, callers don't use the L2 table offset any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-03-12qcow2: Reduce number of I/O requestsKevin Wolf1-77/+166
If the first part of a write request is allocated, but the second isn't and it can be allocated so that the resulting area is contiguous, handle it at once. This is a common case for sequential writes. After this patch, alloc_cluster_offset() only checks if the clusters are already allocated or how many new clusters can be allocated contigouosly. The actual cluster allocation is split off into a new function do_alloc_cluster_offset(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-03-12qcow2: Factor out count_cow_clustersKevin Wolf1-19/+36
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-03-12qcow2: Add some tracingKevin Wolf1-1/+14
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05qcow2: avoid reentrant bdrv_read() in copy_sectors()Stefan Hajnoczi1-8/+19
A BlockDriverState should not issue requests on itself through the public block layer interface. Nested, or reentrant, requests are problematic because they do I/O throttling and request tracking twice. Features like block layer copy-on-read use request tracking to avoid race conditions between concurrent requests. The reentrant request will have to "wait" for its parent request to complete. But the parent is waiting for the reentrant request to make progress so we have reached deadlock. The solution is for block drivers to avoid the public block layer interfaces for reentrant requests. Instead they should call their own internal functions if they wish to perform reentrant requests. This is also a good opportunity to make copy_sectors() a true coroutine_fn. That means calling bdrv_co_writev() instead of bdrv_write(). Behavior is unchanged but we're being explicit that this executes in coroutine context. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05qcow2: Unlock during COWKevin Wolf1-69/+35
Unlocking during COW allows for more parallelism. One change it requires is that buffers are dynamically allocated instead of just using a per-image buffer. While touching the code, drop the synchronous qcow2_read() function and replace it by a bdrv_read() call. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-21qcow2: Fix bdrv_write_compressed error handlingKevin Wolf1-2/+4
If during allocation of compressed clusters the cluster was already allocated uncompressed, fail and properly release the l2_table (the latter avoids a failed assertion). While at it, make it return some real error numbers instead of -1. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
2011-09-12qcow2: fix range checkFrediano Ziglio1-7/+7
QCowL2Meta::offset is not cluster aligned but only sector aligned however nb_clusters count cluster from cluster start. This fix range check. Note that old code have no corruption issues related to this check cause it only cause intersection to occur when shouldn't. Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12qcow2: initialize metadata before inserting in cluster_allocsFrediano Ziglio1-5/+5
QCow2Meta structure was inserted into list before many fields are initialized. Currently is not a problem cause all occur in a lock but if qcow2_alloc_clusters would in a future unlock this lock some issues could arise. Initializing fields before inserting fix the problem. Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12qcow2: removed unused depends_on fieldFrediano Ziglio1-2/+1
Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-25qcow2: use always stderr for debuggingFrediano Ziglio1-1/+1
let all DEBUG_ALLOC2 printf goes to stderr Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow2: fix typo in documentation for qcow2_get_cluster_offset()Devin Nakamura1-2/+2
Documentation states the num is measured in clusters, but its actually measured in sectors Signed-off-by: Devin Nakamura <devin122@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-20Use glib memory allocation and free functionsAnthony Liguori1-6/+6
qemu_malloc/qemu_free no longer exist after this commit. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-02qcow2: Use coroutinesKevin Wolf1-14/+12
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-15qcow2: Fix in-flight list after qcow2_cache_put failureKevin Wolf1-4/+8
If qcow2_cache_put returns an error during cluster allocation and the allocation fails, it must be removed from the list of in-flight allocations. Otherwise we'd get a loop in the list when the ACB is used for the next allocation. Luckily, this qcow2_cache_put shouldn't fail anyway because the L2 table is only read, so that qcow2_cache_put doesn't even involve I/O. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2011-06-08qcow2: Fix memory leaks in error casesKevin Wolf1-1/+1
This fixes memory leaks that may be caused by I/O errors during L1 table growth (can happen during save_vm) and in qemu-img check. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-02-10qcow2: Fix order in L2 table COWKevin Wolf1-3/+6
When copying L2 tables (this happens only with internal snapshots), the order wasn't completely safe, so that after a crash you could end up with a L2 table that has too low refcount, possibly leading to corruption in the long run. This patch puts the operations in the right order: First allocate the new L2 table and replace the reference, and only then decrease the refcount of the old table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>