aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-11-18Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' ↵Peter Maydell2-37/+36
into staging # gpg: Signature made Tue 18 Nov 2014 15:04:14 GMT using RSA key ID 81AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" * remotes/stefanha/tags/tracing-pull-request: Tracing: Fix simpletrace.py error on tcg enabled binary traces Tracing docs fix configure option and description Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-18Tracing: Fix simpletrace.py error on tcg enabled binary tracesChristoph Seifert1-34/+33
simpletrace.py does not recognize the tcg option while reading trace-events file. In result simpletrace does not work on binary traces and tcg enabled events. Moved transformation of tcg enabled events to _read_events() which is used by simpletrace. Signed-off-by: Christoph Seifert <christoph.seifert@posteo.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-18Tracing docs fix configure option and descriptionDr. David Alan Gilbert1-3/+3
Fix the example trace configure option. Update the text to say that multiple backends are allowed and what happens when multiple backends are enabled. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-id: 1412691161-31785-1-git-send-email-dgilbert@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-18Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell1-82/+91
Block patches for 2.2.0-rc2 # gpg: Signature made Tue 18 Nov 2014 11:32:55 GMT using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: block/raw-posix: Catch fsync() errors block/raw-posix: Only sync after successful preallocation block/raw-posix: Fix preallocating write() loop raw-posix: The SEEK_HOLE code is flawed, rewrite it raw-posix: SEEK_HOLE suffices, get rid of FIEMAP raw-posix: Fix comment for raw_co_get_block_status() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-18Merge remote-tracking branch 'remotes/amit-migration/tags/for-2.2' into stagingPeter Maydell1-2/+3
Fix for CVE-2014-7840, avoiding arbitrary qemu memory overwrite for migration by Michael S. Tsirkin. # gpg: Signature made Tue 18 Nov 2014 11:23:00 GMT using RSA key ID 854083B6 # gpg: Good signature from "Amit Shah <amit@amitshah.net>" # gpg: aka "Amit Shah <amit@kernel.org>" # gpg: aka "Amit Shah <amitshah@gmx.net>" * remotes/amit-migration/tags/for-2.2: migration: fix parameter validation on ram load Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-18linux-headers: update to 3.18-rc5Ard Biesheuvel6-11/+40
This updates the Linux header to version 3.18-rc5, adding support for (among other things) read-only memslots on ARM and arm64. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Message-id: 1416248898-6302-1-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-18migration: fix parameter validation on ram loadMichael S. Tsirkin1-2/+3
During migration, the values read from migration stream during ram load are not validated. Especially offset in host_from_stream_offset() and also the length of the writes in the callers of said function. To fix this, we need to make sure that the [offset, offset + length] range fits into one of the allocated memory regions. Validating addr < len should be sufficient since data seems to always be managed in TARGET_PAGE_SIZE chunks. Fixes: CVE-2014-7840 Note: follow-up patches add extra checks on each block->host access. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2014-11-18block/raw-posix: Catch fsync() errorsMax Reitz1-1/+6
fsync() may fail, and that case should be handled. Reported-by: László Érsek <lersek@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-11-18block/raw-posix: Only sync after successful preallocationMax Reitz1-1/+3
The loop which filled the file with zeroes may have been left early due to an error. In that case, the fsync() should be skipped. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-11-18block/raw-posix: Fix preallocating write() loopMax Reitz1-1/+1
write() may write less bytes than requested; in this case, the number of bytes written is returned. This is the byte count we should be subtracting from the number of bytes still to be written, and not the byte count we requested to write. Reported-by: László Érsek <lersek@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-11-18exec: Handle multipage ranges in invalidate_and_set_dirty()Peter Maydell2-4/+27
The code in invalidate_and_set_dirty() needs to handle addr/length combinations which cross guest physical page boundaries. This can happen, for example, when disk I/O reads large blocks into guest RAM which previously held code that we have cached translations for. Unfortunately we were only checking the clean/dirty status of the first page in the range, and then were calling a tb_invalidate function which only handles ranges that don't cross page boundaries. Fix the function to deal with multipage ranges. The symptoms of this bug were that guest code would misbehave (eg segfault), in particular after a guest reboot but potentially any time the guest reused a page of its physical RAM for new code. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1416167061-13203-1-git-send-email-peter.maydell@linaro.org
2014-11-18Merge remote-tracking branch 'mreitz/block' into queue-blockKevin Wolf1-80/+82
* mreitz/block: raw-posix: The SEEK_HOLE code is flawed, rewrite it raw-posix: SEEK_HOLE suffices, get rid of FIEMAP raw-posix: Fix comment for raw_co_get_block_status()
2014-11-18raw-posix: The SEEK_HOLE code is flawed, rewrite itMarkus Armbruster1-26/+85
On systems where SEEK_HOLE in a trailing hole seeks to EOF (Solaris, but not Linux), try_seek_hole() reports trailing data instead. Additionally, unlikely lseek() failures are treated badly: * When SEEK_HOLE fails, try_seek_hole() reports trailing data. For -ENXIO, there's in fact a trailing hole. Can happen only when something truncated the file since we opened it. * When SEEK_HOLE succeeds, SEEK_DATA fails, and SEEK_END succeeds, then try_seek_hole() reports a trailing hole. This is okay only when SEEK_DATA failed with -ENXIO (which means the non-trailing hole found by SEEK_HOLE has since become trailing somehow). For other failures (unlikely), it's wrong. * When SEEK_HOLE succeeds, SEEK_DATA fails, SEEK_END fails (unlikely), then try_seek_hole() reports bogus data [-1,start), which its caller raw_co_get_block_status() turns into zero sectors of data. Could theoretically lead to infinite loops in code that attempts to scan data vs. hole forward. Rewrite from scratch, with very careful comments. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2014-11-18raw-posix: SEEK_HOLE suffices, get rid of FIEMAPMarkus Armbruster1-59/+4
Commit 5500316 (May 2012) implemented raw_co_is_allocated() as follows: 1. If defined(CONFIG_FIEMAP), use the FS_IOC_FIEMAP ioctl 2. Else if defined(SEEK_HOLE) && defined(SEEK_DATA), use lseek() 3. Else pretend there are no holes Later on, raw_co_is_allocated() was generalized to raw_co_get_block_status(). Commit 4f11aa8 (May 2014) changed it to try the three methods in order until success, because "there may be implementations which support [SEEK_HOLE/SEEK_DATA] but not [FIEMAP] (e.g., NFSv4.2) as well as vice versa." Unfortunately, we used FIEMAP incorrectly: we lacked FIEMAP_FLAG_SYNC. Commit 38c4d0a (Sep 2014) added it. Because that's a significant speed hit, the next commit 7c159037 put SEEK_HOLE/SEEK_DATA first. As you see, the obvious use of FIEMAP is wrong, and the correct use is slow. I guess this puts it somewhere between -7 "The obvious use is wrong" and -10 "It's impossible to get right" on Rusty Russel's Hard to Misuse scale[*]. "Fortunately", the FIEMAP code is used only when * SEEK_HOLE/SEEK_DATA aren't defined, but CONFIG_FIEMAP is Uncommon. SEEK_HOLE had no XFS implementation between 2011 (when it was introduced for ext4 and btrfs) and 2012. * SEEK_HOLE/SEEK_DATA and CONFIG_FIEMAP are defined, but lseek() fails Unlikely. Thus, the FIEMAP code executes rarely. Makes it a nice hidey-hole for bugs. Worse, bugs hiding there can theoretically bite even on a host that has SEEK_HOLE/SEEK_DATA. I don't want to worry about this crap, not even theoretically. Get rid of it. [*] http://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2014-11-18raw-posix: Fix comment for raw_co_get_block_status()Markus Armbruster1-3/+1
Missed in commit 705be72. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2014-11-17target-arm: handle address translations that start at level 3Peter Maydell1-9/+11
The ARMv8 address translation system defines that a page table walk starts at a level which depends on the translation granule size and the number of bits of virtual address that need to be resolved. Where the translation granule is 64KB and the guest sets the TCR.TxSZ field to between 35 and 39, it's actually possible to start at level 3 (the final level). QEMU's implementation failed to handle this case, and so we would set level to 2 and behave incorrectly (including invoking the C undefined behaviour of shifting left by a negative number). Correct the code that determines the starting level to deal with the start-at-3 case, by replacing the if-else ladder with an expression derived from the ARM ARM pseudocode version. This error was detected by the Coverity scan, which spotted the potential shift by a negative number. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1415890569-7454-1-git-send-email-peter.maydell@linaro.org
2014-11-17Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell11-24/+41
A smattering of fixes for problems that Coverity reported. # gpg: Signature made Mon 17 Nov 2014 17:03:25 GMT using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: hcd-musb: fix dereference null return value target-cris/translate.c: fix out of bounds read shpc: fix error propaagation qemu-char: fix MISSING_COMMA acl: fix memory leak nvme: remove superfluous check loader: fix NEGATIVE_RETURNS qga: fix false negative argument passing mips_mipssim: fix use-after-free for filename l2tpv3: fix fd leak l2tpv3: fix possible double free libcacard: fix resource leak Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-17hcd-musb: fix dereference null return valuePaolo Bonzini1-2/+6
usb_ep_get and usb_handle_packet can deal with a NULL device, but we have to avoid dereferencing NULL pointers when building the id. Thanks to Gonglei for an initial stab at fixing this. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' ↵Peter Maydell4-0/+0
into staging Update OpenBIOS images # gpg: Signature made Sat 15 Nov 2014 13:12:02 GMT using RSA key ID AE0F321F # gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" * remotes/mcayland/tags/qemu-openbios-signed: Update OpenBIOS images Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-17target-cris/translate.c: fix out of bounds readzhanghailiang1-6/+2
In function t_gen_mov_TN_preg and t_gen_mov_preg_TN, The begin check about the validity of in-parameter 'r' is useless. We still access cpu_PR[r] in the follow code if it is invalid. Which will be an out-of-bounds read error. Fix it by using assert() to ensure it is valid before using it. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17shpc: fix error propaagationGonglei1-1/+2
Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17qemu-char: fix MISSING_COMMAGonglei1-1/+1
Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17acl: fix memory leakGonglei1-5/+5
If 'i != index' for all acl->entries, variable entry leaks the storage it points to. Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17nvme: remove superfluous checkGonglei1-2/+1
Operands don't affect result (CONSTANT_EXPRESSION_RESULT) ((n->bar.aqa >> AQA_ASQS_SHIFT) & AQA_ASQS_MASK) > 4095 is always false regardless of the values of its operands. This occurs as the logical second operand of '||'. Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17loader: fix NEGATIVE_RETURNSGonglei1-0/+13
lseek will return -1 on error, g_malloc0(size) and read(,,size) paramenters cannot be negative. We should add a check for return value of lseek(). Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17qga: fix false negative argument passingGonglei1-2/+2
Function send_response(s, &qdict->base) returns a negative number when any failures occured. But strerror()'s parameter cannot be negative. Let's change the testing condition and pass '-ret' to strerr(). Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17mips_mipssim: fix use-after-free for filenameGonglei1-1/+1
May pass freed pointer filename as an argument to error_report. Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-17l2tpv3: fix fd leakGonglei1-2/+2
In this false branch, fd will leak when it is zero. Change the testing condition. Signed-off-by: Gonglei <arei.gonglei@huawei.com> [Fix net_l2tpv3_cleanup as well. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-15Update OpenBIOS imagesMark Cave-Ayland4-0/+0
Update OpenBIOS images to SVN r1327 built from submodule. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2014-11-14Merge remote-tracking branch 'remotes/sstabellini/xen-2014-11-14' into stagingPeter Maydell3-17/+70
* remotes/sstabellini/xen-2014-11-14: xen_disk: fix unmapping of persistent grants pc: piix4_pm: init legacy PCI hotplug when running on Xen Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-14l2tpv3: fix possible double freezhanghailiang1-1/+0
freeaddrinfo(result) does not assign result = NULL, after frees it. There will be a double free when it goes error case. It is reported by covertiy. Reviewed-by: Gonglei <arei.gonglei@huawei.com> Cc: qemu-stable@nongnu.org Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-14libcacard: fix resource leakzhanghailiang1-1/+6
In function connect_to_qemu(), getaddrinfo() will allocate memory that is stored into server, it should be freed by using freeaddrinfo() before connect_to_qemu() return. Cc: qemu-stable@nongnu.org Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-14Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into ↵Peter Maydell8-143/+214
staging # gpg: Signature made Fri 14 Nov 2014 11:05:54 GMT using RSA key ID 81AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" * remotes/stefanha/tags/block-pull-request: vmdk: Leave bdi intact if -ENOTSUP in vmdk_get_info block: Fix max nb_sectors in bdrv_make_zero ahci: factor out FIS decomposition from handle_cmd ahci: Check cmd_fis[1] more explicitly ahci: Reorder error cases in handle_cmd ahci: Fix FIS decomposition ahci: add is_ncq predicate helper ide: Correct handling of malformed/short PRDTs ahci: unify sglist preparation ide: repair PIO transfers for cases where nsector > 1 ahci: Fix byte count regression for ATAPI/PIO Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-14xen_disk: fix unmapping of persistent grantsRoger Pau Monne1-6/+66
This patch fixes two issues with persistent grants and the disk PV backend (Qdisk): - Keep track of memory regions where persistent grants have been mapped since we need to unmap them as a whole. It is not possible to unmap a single grant if it has been batch-mapped. A new check has also been added to make sure persistent grants are only used if the whole mapped region can be persistently mapped in the batch_maps case. - Unmap persistent grants before switching to the closed state, so the frontend can also free them. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reported-by: George Dunlap <george.dunlap@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: George Dunlap <george.dunlap@eu.citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-11-14pc: piix4_pm: init legacy PCI hotplug when running on XenIgor Mammedov2-11/+4
If user starts QEMU with "-machine pc,accel=xen", then compat property in xenfv won't work and it would cause error: "Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel' set" when PCI device is added with -device on QEMU CLI. From: Igor Mammedov <imammedo@redhat.com> In case of Xen instead of using compat property, just use the fact that xen doesn't use QEMU's fw_cfg/acpi tables to switch piix4_pm into legacy PCI hotplug mode when Xen is enabled. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Li Liang <liang.z.li@intel.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-14vmdk: Leave bdi intact if -ENOTSUP in vmdk_get_infoFam Zheng1-7/+13
When extent types don't match, we return -ENOTSUP. In this case, be polite to the caller and don't modify bdi. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1415938161-16217-1-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14block: Fix max nb_sectors in bdrv_make_zeroFam Zheng1-2/+2
In bdrv_rw_co we report -EINVAL for nb_sectors > INT_MAX / BDRV_SECTOR_SIZE, so a caller shouldn't exceed it. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-id: 1415603264-21497-1-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14ahci: factor out FIS decomposition from handle_cmdJohn Snow1-83/+86
In order to make handle_cmd more readable at the macro level, the details of how to decompose particular types of FIS packets are left to helper functions. In our case, the only type of FIS packet we currently expect to see is a Register H2D FIS packet, but the gory details of its decomposition are of no particular interest in handle_cmd. This patch keeps the receipt of FIS packets and the decomposition thereof separated to two different functions. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1415058979-16604-6-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14ahci: Check cmd_fis[1] more explicitlyJohn Snow1-11/+12
Instead of checking for a known byte, inspect the fields of this byte explicitly to produce more meaningful error messages and improve the readability of this section. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1415058979-16604-5-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14ahci: Reorder error cases in handle_cmdJohn Snow1-15/+14
Error checking in ahci's handle_cmd is re-ordered so that we initialize as few things as possible before we've done our sanity checking. This simplifies returning from this call in case of an error. A check to make sure the DMA memory map succeeds with the correct size is also added, and the debug print of the command fis is cleaned up with its size corrected. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1415058979-16604-4-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14ahci: Fix FIS decompositionJohn Snow1-38/+26
This patch introduces a few changes to how FIS packets are deciphered in the AHCI virtual device. The summary of changes can be grouped into two pieces: [A] Changes to how we apply a preliminary sieve to FISes, [B] Changes in how we internalize a decomposed FIS. == Changes to how we apply a preliminary sieve to FISes == (1) Packets may now either update the Control register or the Command register, but not both. This is according to the SATA 3.2 specification which states: "...the device either initiates processing of the command indicated in the Command register or initiates processing of the control request indicated [...] depending on the state of the C bit in the FIS." See SATA 3.2 section 10.5.5.4, "Reception" in the 10.5.5 "Register Host to Device FIS" section. This change accounts for the first two regions of change within the diff. All other changes belong to the following changes. == Changes in how we internalize a decomposed FIS == (2) Instead of trying to extract the sector number out of the FIS from bytes 4-10 and setting it with ide_set_sector, we set the appropriate IDEState registers and trust that ide_get_sector can retrieve the correct sector later. By "constructing" the sector for use with ide_set_sector, we are duplicating the mechanisms of ide_get_sector. This change makes the FIS decomposition more obvious. SATA 3.2 as a specification does not make the legacy register mapping with respect to the D2H FIS obvious. However, SATA 3.2 section 10.5.5.1 "Register Host to Device FIS layout" describes all of the "cmd_fis" bytes: 0 - FIS Type (0x27) 1 - Port Multiplier Port and Command Update flag 2 - ATA Command 3 - Features_Low 4 - LBA 7:0 5 - LBA 15:8 6 - LBA 23:16 7 - Device, AKA "Drive Select." 8 - LBA 31:24 9 - LBA 39:32 10 - LBA 47:40 11 - Features_High 12 - Count Low 13 - Count High 14 - ICC 15 - Control 16-19 - Auxiliary (for NCQ, defined per-command) Most of these registers map to existing IDEState registers in obvious ways, especially features, select, hob_features, and nsector (count). ICC is reserved in older specifications but is not supported in our implementation, and remains unused here. The Control register is not valid for a command that is trying to update the command register and is to be considered reserved at this point. What is not obvious is the LBA register mappings, but SATA 1.0 can help inform of us legacy device support, see SATA 1.0 section 8.5.2 "Register - Host to Device." LBA 7:0 - Sector Number (sector) LBA 15:8 - Cyl Low (lcyl) LBA 23:16 - Cyl High (hcyl) LBA 31:24 - Sector Num Exp. (hob_sector) LBA 39:32 - Cyl Low Exp. (hob_lcyl) LBA 47:40 - Cyl High Exp. (hob_hcyl) These mappings help guide which registers the FIS should be decomposed into/towards for CHS, LBA28 and LBA48 commands. As a note: The prior confusion that can be seen in the documentation arises from the fact that CHS and LBA28 commands use the low nybble of the drive select register to store LBA 27:24, whereas LNA48 commands use the hob_sector, hob_lcyl and hob_hcyl registers as explained above. The decomposition as it stands now will correctly decompose CHS, LBA28 and LBA48 commands into their appropriate registers where the core IDE/ATAPI layers can deal with them correctly. See the below point for more information. (3) We save cmd_fis[7] as ide_state->select, which informs decisions about if we are using LBA or CHS. This corrects a bug in AHCI wherein we attempt to set and/or retrieve the sector number by using ide_set_sector and ide_get_sector, which depend on the select register to determine if we are using LBA or CHS. Without this adjustment, LBA48 read/writes are currently broken. Thanks to Eniac Zheng @ HP for pointing this out. (4) Save cmd_fis[11] as ide_state->hob_feature, as defined in SATA 3.2. (5) For several ATA commands, the sector count register set to 0 is a magic number that means 256 sectors. For LBA48 commands, this means 65,536 sectors. We drop the magic sector correction here, and trust the ide core layer to handle the conversion appropriately, in ide_cmd_lba48_transform(). As it stands, the current AHCI code is only compliant with LBA28 commands. By simply removing the magic, it will work with LBA28 and LBA48. (6) We expand FIS decomposition to include both ATAPI and IDE devices. We leave the logic of determining if the fields are valid or not to the respective layers. This change intends to make it clearer that AHCI is only a composition mechanism for the FIS packets: the meanings of the registers is best left to the implementation layers for those devices. (7) Forcefully setting the feature, hcyl and lcyl registers for ATAPI commands is removed. - The hcyl and lcyl magic present here is valid at boot only, and should not be overridden for every PACKET command. - The feature register is defined as valid for the PACKET command, so we should not suppress it. The ATAPI layer does not even currently depend on or require 0x01 as mandatory. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1415058979-16604-3-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14ahci: add is_ncq predicate helperJohn Snow2-4/+27
A small helper to determine which S/ATA commands are destined to be routed to the NCQ pathways. This references SATA 3.2 section 13.6, Native Command Queueing. See sections 13.6.4, 13.6.5, 13.6.6, 13.6.7 and 13.6.8 for all SATA commands considered to be part of the NCQ feature set. This is summarized in a small list in section 13.6.3.1 and again in 13.6.3.2. Not all of these NCQ commands are currently supported, so the error pathways are adjusted slightly to be more informative in the case they are encountered. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1415058979-16604-2-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14ide: Correct handling of malformed/short PRDTsJohn Snow5-22/+68
This impacts both BMDMA and AHCI HBA interfaces for IDE. Currently, we confuse the difference between a PRDT having "0 bytes" and a PRDT having "0 complete sectors." When we receive an incomplete sector, inconsistent error checking leads to an infinite loop wherein the call succeeds, but it didn't give us enough bytes -- leading us to re-call the DMA chain over and over again. This leads to, in the BMDMA case, leaked memory for short PRDTs, and infinite loops and resource usage in the AHCI case. The .prepare_buf() callback is reworked to return the number of bytes that it successfully prepared. 0 is a valid, non-error answer that means the table was empty and described no bytes. -1 indicates an error. Our current implementation uses the io_buffer in IDEState to ultimately describe the size of a prepared scatter-gather list. Even though the AHCI PRDT/SGList can be as large as 256GiB, the AHCI command header limits transactions to just 4GiB. ATA8-ACS3, however, defines the largest transaction to be an LBA48 command that transfers 65,536 sectors. With a 512 byte sector size, this is just 32MiB. Since our current state structures use the int type to describe the size of the buffer, and this state is migrated as int32, we are limited to describing 2GiB buffer sizes unless we change the migration protocol. For this reason, this patch begins to unify the assertions in the IDE pathways that the scatter-gather list provided by either the AHCI PRDT or the PCI BMDMA PRDs can only describe, at a maximum, 2GiB. This should be resilient enough unless we need a sector size that exceeds 32KiB. Further, the likelihood of any guest operating system actually attempting to transfer this much data in a single operation is very slim. To this end, the IDEState variables have been updated to more explicitly clarify our maximum supported size. Callers to the prepare_buf callback have been reworked to understand the new return code, and all versions of the prepare_buf callback have been adjusted accordingly. Lastly, the ahci_populate_sglist helper, relied upon by the AHCI implementation of .prepare_buf() as well as the PCI implementation of the callback have had overflow assertions added to help make clear the reasonings behind the various type changes. [Added %d -> %"PRId64" fix John sent because off_pos changed from int to int64_t. --Stefan] Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1414785819-26209-4-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14ahci: unify sglist preparationJohn Snow1-2/+2
The intent of this patch is to further unify the creation and deletion of the sglist used for all AHCI transfers, including emulated PIO, ATAPI R/W, and native DMA R/W. By replacing ahci_start_transfer's call to ahci_populate_sglist with ahci_dma_prepare_buf, we reduce the number of direct calls where we manipulate the scatter-gather list in the AHCI code. To make this switch, the constant "0" passed as an offset in ahci_dma_prepare_buf is adjusted to use io_buffer_offset. For DMA pathways, this has no effect: io_buffer_offset is always updated to 0 at the beginning of a DMA transfer loop regardless. DMA pathways through ide_dma_cb() update the io_buffer_offset accordingly, and for circumstances where we might make several trips through this loop, this may actually correct a design flaw. For PIO pathways, the newly updated ahci_dma_prepare_buf will now prepare the sglist at the correct offset. It will also set io_buffer_size, but this is not used in the cmd_read_pio or cmd_write_pio pathways. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1414785819-26209-3-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14ide: repair PIO transfers for cases where nsector > 1John Snow2-1/+5
Currently, for emulated PIO transfers through the AHCI device, any attempt made to request more than a single sector's worth of data will result in the same sector being transferred over and over. For example, if we request 8 sectors via PIO READ SECTORS, the AHCI device will give us the same sector eight times. This patch adds offset tracking into the PIO pathways so that we can fulfill these requests appropriately. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1414785819-26209-2-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14ahci: Fix byte count regression for ATAPI/PIOJohn Snow1-0/+1
This patch fixes a regression caused by commit 659142ecf71a0da240ab0ff7cf929ee25c32b9bc. The problem occurs when we wish to return early from the ahci_start_transfer function, but are now updating the transferred byte count in the AHCI command header via ahci_commit_buf. This will cause problems in the Windows 8 installer. Don't update the byte count in the command header for the transmission of ATAPI packets: These commands will distort the final byte count of the actual data payload. The call to ahci_commit_buf remains in the "out" portion of the call in order to clean up the sglist. The byte count is maintained by forcing size to be 0. Signed-off-by: John Snow <jsnow@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-13Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell13-26/+109
x86 and SCSI fixes. I left out the APIC device model patches, pending confirmation from the submitter that they really fix QNX. # gpg: Signature made Thu 13 Nov 2014 15:13:38 GMT using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: acpi: accurate overflow check smbios: change 'ram_addr_t' variables to 'uint64_t' kvmclock: Add comment explaining why we need cpu_clean_all_dirty() target-i386: fix Coverity complaints about overflows apic_common: migrate missing fields target-i386: eliminate dead code and hoist common code out of "if" virtio-scsi: Fix comment for VirtIOSCSIReq virtio-scsi: dataplane: suppress guest notification esp: Do not overwrite ESP_TCHI after reset virtio-scsi: dataplane: fix allocation for 'cmd_vrings' esp: fix coding standards virtio-scsi: work around bug in old BIOSes esp-pci: fixup deadlock with linux Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-13acpi: accurate overflow checkPavel Dovgalyuk1-2/+5
Compare clock in ns, because acpi_pm_tmr_update uses rounded to ns value instead of ticks. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> [This lets Windows boot in icount mode. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-13smbios: change 'ram_addr_t' variables to 'uint64_t'SeokYeon Hwang1-5/+5
ram_addr_t should not be used except if referring to a RAMBlobk. Using 'uint64_t' avoids a -Wconstant-conversion warning, which clang >= 3.4 produces in "smbios_get_tables()". Signed-off-by: SeokYeon Hwang <syeon.hwang@samsung.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-13kvmclock: Add comment explaining why we need cpu_clean_all_dirty()Eduardo Habkost1-0/+14
Try to explain why commit 317b0a6d8ba44e9bf8f9c3dbd776c4536843d82c needed a cpu_clean_all_dirty() call just after calling cpu_synchronize_all_states(). Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Cc: Andrey Korolyov <andrey@xdel.ru> Cc: Marcin Gibuła <m.gibula@beyond.pl> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>