path: root/nbd/trace-events
Age | Commit message | Author | Files | Lines
2023-10-05 | nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS | Eric Blake | 1 | -0/+1
Allow a client to request a subset of negotiated meta contexts. For example, a client may ask to use a single connection to learn about both block status and dirty bitmaps, but where the dirty bitmap queries only need to be performed on a subset of the disk; forcing the server to compute that information on block status queries in the rest of the disk is wasted effort (both at the server, and on the amount of traffic sent over the wire to be parsed and ignored by the client). Qemu as an NBD client never requests to use more than one meta context, so it has no need to use block status payloads. Testing this instead requires support from libnbd, which CAN access multiple meta contexts in parallel from a single NBD connection; an interop test submitted to the libnbd project at the same time as this patch demonstrates the feature working, as well as testing some corner cases (for example, when the payload length is longer than the export length), although other corner cases (like passing the same id duplicated) require a protocol fuzzer because libnbd is not wired up to break the protocol that badly. This also includes tweaks to 'qemu-nbd --list' to show when a server is advertising the capability, and to the testsuite to reflect the addition to that output. Of note: qemu will always advertise the new feature bit during NBD_OPT_INFO if extended headers have already been negotiated (regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has occurred); but for NBD_OPT_GO, qemu only advertises the feature if block status is also enabled (that is, if the client does not negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so the feature is not advertised). Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20230925192229.3186470-26-eblake@redhat.com> [eblake: fix logic to reject unnegotiated contexts] Signed-off-by: Eric Blake <eblake@redhat.com>
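The payload described above can be sketched in Python. This is only an illustration, not qemu's code: the layout (a 64-bit big-endian effect length followed by 32-bit context ids) is an assumption based on the description in this log, and the function names are hypothetical.

```python
import struct


def build_block_status_payload(effect_length, context_ids):
    """Pack an assumed payload: 64-bit effect length, then 32-bit context ids."""
    return struct.pack(">Q", effect_length) + b"".join(
        struct.pack(">I", cid) for cid in context_ids
    )


def parse_block_status_payload(payload, negotiated_ids):
    """Return (effect_length, ids); raise ValueError on a malformed payload."""
    if len(payload) < 8 or (len(payload) - 8) % 4 != 0:
        raise ValueError("payload has the wrong size")
    (effect_length,) = struct.unpack_from(">Q", payload, 0)
    count = (len(payload) - 8) // 4
    ids = list(struct.unpack_from(">%dI" % count, payload, 8))
    # Corner cases mentioned in the commit message: duplicated ids and
    # ids that were never negotiated are both rejected in this sketch.
    if len(set(ids)) != len(ids):
        raise ValueError("duplicate context id")
    for cid in ids:
        if cid not in negotiated_ids:
            raise ValueError("context id %d was not negotiated" % cid)
    return effect_length, ids
```

A client that negotiated contexts 1 and 2 could then ask for status on context 2 only, and a server following this sketch would skip computing context 1.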
2023-10-05 | nbd/client: Initial support for extended headers | Eric Blake | 1 | -1/+2
Update the client code to be able to send an extended request, and parse an extended header from the server. Note that since we reject any structured reply with a too-large payload, we can always normalize a valid header back into the compact form, so that the caller need not deal with two branches of a union. Still, until a later patch lets the client negotiate extended headers, the code added here should not be reached. Note that because of the different magic numbers, it is just as easy to trace and then tolerate a non-compliant server sending the wrong header reply as it would be to insist that the server is compliant. Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20230925192229.3186470-21-eblake@redhat.com> [eblake: fix trace format] Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
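The normalization argument can be illustrated with a small Python sketch. It assumes a 32M payload cap (the figure used elsewhere in this log); the class and function names are hypothetical and do not come from qemu.

```python
from dataclasses import dataclass

MAX_PAYLOAD = 32 * 1024 * 1024  # assumed cap on any reply payload


@dataclass
class CompactChunkHeader:
    flags: int
    type: int
    cookie: int
    length: int  # guaranteed to fit in 32 bits after normalization


def normalize_extended(flags, chunk_type, cookie, length64):
    """Fold an extended (64-bit length) header into the compact form."""
    if length64 > MAX_PAYLOAD:
        # Too-large payloads are rejected up front, so nothing that
        # survives this check can overflow a 32-bit length field.
        raise ValueError("reply payload too large")
    return CompactChunkHeader(flags, chunk_type, cookie, length64)
```

Because the rejection happens before normalization, callers only ever see one header shape.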
2023-10-05 | nbd/server: Support a request payload | Eric Blake | 1 | -0/+1
Upcoming additions to support NBD 64-bit effect lengths allow for the possibility to distinguish between payload length (capped at 32M) and effect length (64 bits, although we generally assume 63 bits because of off_t limitations). Without that extension, only the NBD_CMD_WRITE request has a payload; but with the extension, it makes sense to allow at least NBD_CMD_BLOCK_STATUS to have both a payload and effect length in a future patch (where the payload is a limited-size struct that in turn gives the real effect length as well as a subset of known ids for which status is requested). Other future NBD commands may also have a request payload, so the 64-bit extension introduces a new NBD_CMD_FLAG_PAYLOAD_LEN that distinguishes between whether the header length is a payload length or an effect length, rather than hard-coding the decision based on the command. According to the spec, a client should never send a command with a payload without the negotiation phase proving such extension is available. So in the unlikely event the bit is set or cleared incorrectly, the client is already at fault; if the client then provides the payload, we can gracefully consume it off the wire and fail the command with NBD_EINVAL (subsequent checks for magic numbers ensure we are still in sync), while if the client fails to send payload we block waiting for it (basically deadlocking our connection to the bad client, but not negatively impacting our ability to service other clients, so not a security risk). Note that we do not support the payload version of BLOCK_STATUS yet. 
This patch also fixes a latent bug introduced in b2578459: once request->len can be 64 bits, assigning it to a 32-bit payload_len can cause wraparound to 0 which then sets req->complete prematurely; thankfully, the bug was not possible back then (it takes this and later patches to even allow request->len larger than 32 bits; and previously the only 'payload_len = request->len' assignment was in NBD_CMD_WRITE, which also sets check_length, which in turn rejects lengths larger than 32M before relying on any possibly-truncated value stored in payload_len). Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20230925192229.3186470-15-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> [eblake: enhance comment on handling client error, fix type bug] Signed-off-by: Eric Blake <eblake@redhat.com>
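The wraparound is easy to reproduce with plain arithmetic. The Python sketch below models a 32-bit assignment with a mask; the helper names are hypothetical, and the 32M limit is the one quoted in the commit message.

```python
MAX_PAYLOAD = 32 * 1024 * 1024  # the 32M limit quoted above


def assign_to_u32(value):
    """Model what storing a 64-bit value in a 32-bit variable keeps."""
    return value & 0xFFFFFFFF


def checked_payload_len(request_len):
    """Reject oversized lengths before any truncating assignment."""
    if request_len > MAX_PAYLOAD:
        raise ValueError("length exceeds maximum payload")
    return assign_to_u32(request_len)


# A 4 GiB request length silently becomes 0 when truncated.
wrapped = assign_to_u32(1 << 32)
```

With the length check in front, the truncating assignment is never reached for values that could wrap.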
2023-09-25 | nbd: Prepare for 64-bit request effect lengths | Eric Blake | 1 | -7/+7
Widen the length field of NBDRequest to 64 bits, although we can assert that all current uses are still under 32 bits: either because of NBD_MAX_BUFFER_SIZE which is even smaller (and where size_t can still be appropriate, even on 32-bit platforms), or because nothing ever puts us into NBD_MODE_EXTENDED yet (and while future patches will allow larger transactions, the lengths in play here are still capped at 32-bit). There are no semantic changes, other than a typo fix in a couple of error messages. Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20230829175826.377251-23-eblake@redhat.com> [eblake: fix assertion bug in nbd_co_send_simple_reply] Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-07-19 | nbd: s/handle/cookie/ to match NBD spec | Eric Blake | 1 | -11/+11
Externally, libnbd exposed the 64-bit opaque marker for each client NBD packet as the "cookie", because it was less confusing when contrasted with 'struct nbd_handle *' holding all libnbd state. It also avoids confusion between the noun 'handle' as a way to identify a packet and the verb 'handle' for reacting to things like signals. Upstream NBD changed their spec to favor the name "cookie" based on libnbd's recommendations[1], so we can do likewise. [1] https://github.com/NetworkBlockDevice/nbd/commit/ca4392eb2b Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20230608135653.2918540-6-eblake@redhat.com> [eblake: typo fix] Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-07-19 | nbd/server: Prepare for alternate-size headers | Eric Blake | 1 | -4/+4
Upstream NBD now documents[1] an extension that supports 64-bit effect lengths in requests. As part of that extension, the size of the reply headers will change in order to permit a 64-bit length in the reply for symmetry[2]. Additionally, where the reply header is currently 16 bytes for simple reply, and 20 bytes for structured reply; with the extension enabled, there will only be one extended reply header, of 32 bytes, with both structured and extended modes sending identical payloads for chunked replies. Since we are already wired up to use iovecs, it is easiest to allow for this change in header size by splitting each structured reply across multiple iovecs, one for the header (which will become wider in a future patch according to client negotiation), and the other(s) for the chunk payload, and removing the header from the payload struct definitions. Rename the affected functions with s/structured/chunk/ to make it obvious that the code will be reused in extended mode. Interestingly, the client side code never utilized the packed types, so only the server code needs to be updated. [1] https://github.com/NetworkBlockDevice/nbd/blob/extension-ext-header/doc/proto.md as of NBD commit e6f3b94a934 [2] Note that on the surface, this is because some future server might permit a 4G+ NBD_CMD_READ and need to reply with that much data in one transaction. But even though the extended reply length is widened to 64 bits, for now the NBD spec is clear that servers will not reply with more than a maximum payload bounded by the 32-bit NBD_INFO_BLOCK_SIZE field; allowing a client and server to mutually agree to transactions larger than 4G would require yet another extension. Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20230608135653.2918540-4-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
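The header sizes quoted above (16, 20 and 32 bytes) and the header/payload split can be checked with a Python sketch. The field order in each layout is an assumption based on my reading of the NBD protocol document, the magic value is left as a caller-supplied parameter, and `chunk_iov` is a hypothetical helper rather than a qemu function.

```python
import struct

# Assumed big-endian layouts:
SIMPLE_REPLY_HEADER = struct.Struct(">IIQ")        # magic, error, cookie
STRUCTURED_CHUNK_HEADER = struct.Struct(">IHHQI")  # magic, flags, type, cookie, length
EXTENDED_CHUNK_HEADER = struct.Struct(">IHHQQQ")   # magic, flags, type, cookie, offset, length


def chunk_iov(extended, magic, flags, chunk_type, cookie, payload, offset=0):
    """Return [header, payload], keeping the header in its own buffer."""
    if extended:
        header = EXTENDED_CHUNK_HEADER.pack(
            magic, flags, chunk_type, cookie, offset, len(payload))
    else:
        header = STRUCTURED_CHUNK_HEADER.pack(
            magic, flags, chunk_type, cookie, len(payload))
    return [header, payload]
```

Since the payload buffer is identical in both modes, only the first element of the list changes size when extended headers are negotiated.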
2022-06-29 | nbd: trace long NBD operations | Denis V. Lunev | 1 | -0/+3
At the moment there are two sources of lengthy operations, if configured:
* opening a connection, which can retry internally, and
* reconnecting an already opened connection.
These operations can take a long time and are cumbersome to catch, so it is natural to add trace points for them. This patch is based on the original downstream work by Vladimir. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Eric Blake <eblake@redhat.com> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> CC: Kevin Wolf <kwolf@redhat.com> CC: Hanna Reitz <hreitz@redhat.com> CC: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2021-06-02 | docs: fix references to docs/devel/tracing.rst | Stefano Garzarella | 1 | -1/+1
Commit e50caf4a5c ("tracing: convert documentation to rST") converted docs/devel/tracing.txt to docs/devel/tracing.rst. We still have several references to the old file, so let's fix them with the following command: sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt) Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210517151702.109066-2-sgarzare@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2019-09-18 | trace: Remove trailing newline in events | Philippe Mathieu-Daudé | 1 | -2/+2
While the tracing framework does not forbid trailing newlines in event format strings, using them leads to confusing output. It is the responsibility of the backend to properly end an event line. Some of our formats have trailing newlines, remove them. [Fixed typo in commit description reported by Eric Blake <eblake@redhat.com> --Stefan] Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190916095121.29506-2-philmd@redhat.com Message-Id: <20190916095121.29506-2-philmd@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-09-05 | nbd: Tolerate more errors to structured reply request | Eric Blake | 1 | -1/+1
A server may have a reason to reject a request for structured replies, beyond just not recognizing them as a valid request; similarly, it may have a reason for rejecting a request for a meta context. It doesn't hurt us to continue talking to such a server; otherwise 'qemu-nbd --list' of such a server fails to display all available details about the export. Encountered when temporarily tweaking nbdkit to reply with NBD_REP_ERR_POLICY. Present since structured reply support was first added (commit d795299b reused starttls handling, but starttls is different in that we can't fall back to other behavior on any error). Note that for an unencrypted client trying to connect to a server that requires encryption, this defers the point of failure to when we finally execute a strict command (such as NBD_OPT_GO or NBD_OPT_LIST), now that the intermediate NBD_OPT_STRUCTURED_REPLY does not diagnose NBD_REP_ERR_TLS_REQD as fatal; but as the protocol eventually gets us to a command where we can't continue onwards, the changed error message doesn't cause any security concerns. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190824172813.29720-3-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> [eblake: fix iotest 233]
2019-04-08 | nbd/server: Trace client noncompliance on unaligned requests | Eric Blake | 1 | -0/+1
We've recently added traces for clients to flag server non-compliance; let's do the same for servers to flag client non-compliance. According to the spec, if the client requests NBD_INFO_BLOCK_SIZE, it is promising to send all requests aligned to those boundaries. Of course, if the client does not request NBD_INFO_BLOCK_SIZE, then it made no promises so we shouldn't flag anything; and because we are willing to handle clients that made no promises (the spec allows us to use NBD_REP_ERR_BLOCK_SIZE_REQD if we had been unwilling), we already have to handle unaligned requests (which the block layer already does on our behalf). So even though the spec allows us to return EINVAL for clients that promised to behave, it's easier to always answer unaligned requests. Still, flagging non-compliance can be useful in debugging a client that is trying to be maximally portable. Qemu as client used to have one spot where it sent non-compliant requests: if the server sends an unaligned reply to NBD_CMD_BLOCK_STATUS, and the client was iterating over the entire disk, the next request would start at that unaligned point; this was fixed in commit a39286dd when the client was taught to work around server non-compliance; but is equally fixed if the server is patched to not send unaligned replies in the first place (yes, qemu 4.0 as server still has a few such bugs, although they will be patched in 4.1). Fortunately, I did not find any more spots where qemu as client was non-compliant. I was able to test the patch by using the following hack to convince qemu-io to run various unaligned commands, coupled with serving 512-byte alignment by intentionally omitting '-f raw' on the server while viewing server traces.
| diff --git i/nbd/client.c w/nbd/client.c
| index 427980bdd22..1858b2aac35 100644
| --- i/nbd/client.c
| +++ w/nbd/client.c
| @@ -449,6 +449,7 @@ static int nbd_opt_info_or_go(QIOChannel *ioc, uint32_t opt,
|                  nbd_send_opt_abort(ioc);
|                  return -1;
|              }
| +            info->min_block = 1;//hack
|              if (!is_power_of_2(info->min_block)) {
|                  error_setg(errp, "server minimum block size %" PRIu32
|                             " is not a power of two", info->min_block);
Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190403030526.12258-3-eblake@redhat.com> [eblake: address minor review nits] Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
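The alignment test that such a trace point relies on can be sketched in Python. This is an illustration under the assumption that the advertised minimum block size is a power of two (as the hack above checks); `is_aligned_request` is a hypothetical name, not a qemu function.

```python
def is_aligned_request(offset, length, min_block):
    """True when both offset and length are multiples of min_block."""
    if min_block <= 0 or min_block & (min_block - 1):
        raise ValueError("minimum block size must be a power of two")
    # OR-ing the two values lets a single mask test cover both.
    return (offset | length) & (min_block - 1) == 0
```

A server following this sketch would still service a request for which the function returns False; it would only emit a trace noting that the client broke its promise.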
2019-03-22 | trace-events: Delete unused trace points | Markus Armbruster | 1 | -2/+0
Tracked down with cleanup-trace-events.pl. Funnies requiring manual post-processing:
* block.c and blockdev.c trace points are in block/trace-events.
* hw/block/nvme.c uses the preprocessor to hide its trace point use from cleanup-trace-events.pl.
* include/hw/xen/xen_common.h trace points are in hw/xen/trace-events.
* net/colo-compare and net/filter-rewriter.c use pseudo trace points colo_compare_udp_miscompare and colo_filter_rewriter_debug to guard debug code.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-id: 20190314180929.27722-5-armbru@redhat.com Message-Id: <20190314180929.27722-5-armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-22 | trace-events: Shorten file names in comments | Markus Armbruster | 1 | -3/+3
We spell out sub/dir/ in sub/dir/trace-events' comments pointing to source files. That's because when trace-events got split up, the comments were moved verbatim. Delete the sub/dir/ part from these comments. Gets rid of several misspellings. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190314180929.27722-3-armbru@redhat.com Message-Id: <20190314180929.27722-3-armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-22 | trace-events: Consistently point to docs/devel/tracing.txt | Markus Armbruster | 1 | -0/+2
Almost all trace-events point to docs/devel/tracing.txt in a comment right at the beginning. Touch up the ones that don't. [Updated with Markus' new commit description wording. --Stefan] Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190314180929.27722-2-armbru@redhat.com Message-Id: <20190314180929.27722-2-armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-21 | nbd/client: Refactor nbd_opt_go() to support NBD_OPT_INFO | Eric Blake | 1 | -4/+4
Rename the function to nbd_opt_info_or_go() with an added parameter and slight changes to comments and trace messages, in order to reuse the function for NBD_OPT_INFO. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190117193658.16413-17-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2019-01-21 | nbd/client: Split handshake into two functions | Eric Blake | 1 | -1/+1
An upcoming patch will add the ability for qemu-nbd to list the services provided by an NBD server. Share the common code of the TLS handshake by splitting the initial exchange into a separate function, leaving only the export handling in the original function. Functionally, there should be no change in behavior in this patch, although some of the code motion may be difficult to follow due to indentation changes (view with 'git diff -w' for a smaller changeset). I considered an enum for the return code coordinating state between the two functions, but in the end just settled with ample comments. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20190117193658.16413-15-eblake@redhat.com>
2019-01-21 | nbd/client: Split out nbd_receive_one_meta_context() | Eric Blake | 1 | -1/+1
Extract portions of nbd_negotiate_simple_meta_context() to a new function nbd_receive_one_meta_context() that copies the pattern of nbd_receive_list() for performing the argument validation of one reply. The error message when the server replies with more than one context changes slightly, but that shouldn't happen in the common case. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20190117193658.16413-13-eblake@redhat.com>
2019-01-21 | nbd/client: Split out nbd_send_meta_query() | Eric Blake | 1 | -1/+1
Refactor nbd_negotiate_simple_meta_context() to pull out the code that can be reused to send a LIST request for 0 or 1 query. No semantic change. The old comment about 'sizeof(uint32_t)' being equivalent to '/* number of queries */' is no longer needed, now that we are computing 'sizeof(queries)' instead. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Message-Id: <20190117193658.16413-12-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2019-01-21 | nbd/client: Move export name into NBDExportInfo | Eric Blake | 1 | -1/+1
Refactor the 'name' parameter of nbd_receive_negotiate() from being a separate parameter into being part of the in-out 'info'. This also spills over to a simplification of nbd_opt_go(). The main driver for this refactoring is that an upcoming patch would like to add support to qemu-nbd to list information about all exports available on a server, where the name(s) will be provided by the server instead of the client. But another benefit is that we can now allow the client to explicitly specify the empty export name "" even when connecting to an oldstyle server (even if qemu is no longer such a server after commit 7f7dfe2a). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20190117193658.16413-10-eblake@redhat.com>
2019-01-21 | nbd/client: Refactor nbd_receive_list() | Eric Blake | 1 | -0/+1
Right now, nbd_receive_list() is only called by nbd_receive_query_exports(), which in turn is only called if the server lacks NBD_OPT_GO but has working option negotiation, and is merely used as a quality-of-implementation trick since servers can't give decent errors for NBD_OPT_EXPORT_NAME. However, servers that lack NBD_OPT_GO are becoming increasingly rare (nbdkit was a latecomer, in Aug 2018, but qemu has been such a server since commit f37708f6 in July 2017 and released in 2.10), so it no longer makes sense to micro-optimize that function for performance. Furthermore, when debugging a server's implementation, tracing the full reply (both names and descriptions) is useful, not to mention that upcoming patches adding 'qemu-nbd --list' will want to collect that data. And when you consider that a server can send an export name up to the NBD protocol length limit of 4k; but our current NBD_MAX_NAME_SIZE is only 256, we can't trace all valid server names without more storage, but 4k is large enough that the heap is better than the stack for long names. Thus, I'm changing the division of labor, with nbd_receive_list() now always malloc'ing a result on success (the malloc is bounded by the fact that we reject servers with a reply length larger than 32M), and moving the comparison to 'wantname' to the caller. There is a minor change in behavior where a server with 0 exports (an immediate NBD_REP_ACK reply) is now no longer distinguished from a server without LIST support (NBD_REP_ERR_UNSUP); this information could be preserved with a complication to the calling contract to provide a bit more information, but I didn't see the point. 
After all, the worst that can happen if our guess at a match is wrong is that the caller will get a cryptic disconnect when NBD_OPT_EXPORT_NAME fails (which is no different from what would happen if we had not tried LIST), while treating an empty list as immediate failure would prevent connecting to really old servers that really did lack LIST. Besides, NBD servers with 0 exports are rare (qemu can do it when using QMP nbd-server-start without nbd-server-add - but qemu understands NBD_OPT_GO and thus won't tickle this change in behavior). Fix the spelling of foundExport to match coding standards while in the area. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20190117193658.16413-9-eblake@redhat.com>
2019-01-04 | nbd/client: Trace all server option error messages | Eric Blake | 1 | -0/+1
Not all servers send free-form text alongside option error replies, but for servers that do (such as qemu), we pass the server's message as a hint alongside our own error reporting. However, it would also be useful to trace such server messages, since we can't guarantee how the hint may be consumed. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20181218225714.284495-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2018-06-21 | nbd/server: implement dirty bitmap export | Vladimir Sementsov-Ogievskiy | 1 | -0/+1
Handle a new NBD meta namespace: "qemu", and corresponding queries: "qemu:dirty-bitmap:<export bitmap name>". With the new metadata context negotiated, BLOCK_STATUS query will reply with dirty-bitmap data, converted to extents. The new public function nbd_export_bitmap selects which bitmap to export. For now, only one bitmap may be exported. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20180609151758.17343-5-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: wording tweaks, minor cleanups, additional tracing] Signed-off-by: Eric Blake <eblake@redhat.com>
2018-04-02 | nbd: trace meta context negotiation | Eric Blake | 1 | -0/+6
Having a more detailed log of the interaction between client and server is invaluable in debugging how meta context negotiation actually works. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20180330130950.1931229-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2018-03-01 | nbd/client: fix error messages in nbd_handle_reply_err | Vladimir Sementsov-Ogievskiy | 1 | -4/+4
1. NBD_REP_ERR_INVALID is not only about length, so make the message more general.
2. The hex format is not very good: it's hard to read something like "option a (set meta context)", so switch to decimal.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <1518702707-7077-6-git-send-email-vsementsov@virtuozzo.com> [eblake: expand scope of patch: ALL uses of nbd_opt_lookup and nbd_rep_lookup are now decimal] Signed-off-by: Eric Blake <eblake@redhat.com>
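The readability problem with hex is easy to see in a short Python sketch. The option number 10 for "set meta context" is taken from my reading of the NBD protocol document and should be treated as an assumption; the helper names are hypothetical.

```python
NBD_OPT_SET_META_CONTEXT = 10  # assumed value, see the lead-in


def describe_option_hex(opt, name):
    """Old style: a bare hex number that can read like a word."""
    return "option %x (%s)" % (opt, name)


def describe_option_dec(opt, name):
    """New style: decimal, unambiguous without a 0x prefix."""
    return "option %d (%s)" % (opt, name)
```

With hex, option 10 prints as "option a", which reads like an article rather than a number; decimal avoids that.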
2018-01-08 | nbd/server: Implement sparse reads atop structured reply | Eric Blake | 1 | -0/+1
The reason that NBD added structured reply in the first place was to allow for efficient reads of sparse files, by allowing the reply to include chunks to quickly communicate holes to the client without sending lots of zeroes over the wire. Time to implement this in the server; our client can already read such data. We can only skip holes insofar as the block layer can query them; and only if the client is okay with a fragmented request (if a client requests NBD_CMD_FLAG_DF and the entire read is a hole, we could technically return a single NBD_REPLY_TYPE_OFFSET_HOLE, but that's a fringe case not worth catering to here). Sadly, the control flow is a bit wonkier than I would have preferred, but it was minimally invasive to have a split in the action between a fragmented read (handled directly where we recognize NBD_CMD_READ with the right conditions, and sending multiple chunks) vs. a single read (handled at the end of nbd_trip, for both simple and structured replies, when we know there is only one thing being read). Likewise, I didn't make any effort to optimize the final chunk of a fragmented read to set the NBD_REPLY_FLAG_DONE, but unconditionally send that as a separate NBD_REPLY_TYPE_NONE. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171107030912.23930-2-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
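The control flow described above can be modelled with a Python sketch. It is not the server's code: the extent list stands in for whatever the block layer reports, chunk types are plain strings, and `plan_sparse_read` is a hypothetical name. Each planned chunk is a tuple of (type, offset, length, done flag).

```python
def plan_sparse_read(offset, length, extents, dont_fragment):
    """Plan reply chunks for a read.

    extents: list of (length, is_hole) pairs covering the request in order.
    """
    if dont_fragment:
        # With NBD_CMD_FLAG_DF the whole read goes out as one data chunk.
        return [("OFFSET_DATA", offset, length, True)]
    chunks = []
    pos = offset
    for ext_len, is_hole in extents:
        kind = "OFFSET_HOLE" if is_hole else "OFFSET_DATA"
        chunks.append((kind, pos, ext_len, False))
        pos += ext_len
    if pos != offset + length:
        raise ValueError("extents do not cover the request")
    # As the message notes, the final data chunk is not optimized to carry
    # the DONE flag; a separate NONE chunk terminates the reply instead.
    chunks.append(("NONE", None, 0, True))
    return chunks
```

Hole chunks carry no payload, which is where the savings over sending zeroes come from.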
2017-11-09 | nbd/server: Fix structured read of length 0 | Eric Blake | 1 | -0/+1
The NBD spec was recently clarified to state that a read of length 0 should not be attempted by a compliant client; but that a server must still handle it correctly in an unspecified manner (that is, either a successful no-op or an error reply, but not a crash) [1]. However, it also implies that NBD_REPLY_TYPE_OFFSET_DATA must have a non-zero payload length, but our existing code was replying with a chunk that a picky client could reject as invalid because it was missing a payload (our own client implementation was recently patched to be that picky, after first fixing it to not send 0-length requests). We are already doing successful no-ops for 0-length writes and for non-structured reads; so for consistency, we want structured reply reads to also be a no-op. The easiest way to do this is to return a NBD_REPLY_TYPE_NONE chunk; this is best done via a new helper function (especially since future patches for other structured replies may benefit from using the same helper). [1] https://github.com/NetworkBlockDevice/nbd/commit/ee926037 Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171108215703.9295-8-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2017-11-09 | nbd/client: Nicer trace of structured reply | Eric Blake | 1 | -1/+1
It's useful to know which structured reply chunk is being processed. Missed in commit d2febedb. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171108215703.9295-4-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2017-10-30 | nbd/client: prepare nbd_receive_reply for structured reply | Vladimir Sementsov-Ogievskiy | 1 | -1/+2
In a following patch, nbd_receive_reply will be used to receive both simple and structured reply headers. NBDReply is altered into a union of the simple reply header and the structured reply chunk header, and simple error translation is moved to block/nbd-client to be consistent with the upcoming structured reply error translation. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171027104037.8319-11-eblake@redhat.com>
2017-10-30 | nbd/client: refactor nbd_receive_starttls | Vladimir Sementsov-Ogievskiy | 1 | -3/+1
Split out nbd_request_simple_option to be reused for structured reply option. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171027104037.8319-10-eblake@redhat.com>
2017-10-30 | nbd/server: Include human-readable message in structured errors | Eric Blake | 1 | -1/+1
The NBD spec permits including a human-readable error string if structured replies are in force, so we might as well send the client the message that we logged on any error. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20171027104037.8319-9-eblake@redhat.com>
2017-10-30 | nbd: Minimal structured read for server | Vladimir Sementsov-Ogievskiy | 1 | -0/+2
Minimal implementation of structured read: one structured reply chunk, no segmentation. Minimal structured error implementation: no text message. Support DF flag, but just ignore it, as there is no segmentation anyway. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171027104037.8319-8-eblake@redhat.com>
2017-10-30 | nbd: Move nbd_errno_to_system_errno() to public header | Eric Blake | 1 | -1/+3
This is needed in preparation for structured reply handling, as we will be performing the translation from NBD error to system errno value higher in the stack at block/nbd-client.c. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20171027104037.8319-3-eblake@redhat.com>
2017-10-30 | nbd: Include error names in trace messages | Eric Blake | 1 | -2/+2
NBD errors were originally sent over the wire based on Linux errno values; but not all the world is Linux, and not all platforms share the same values. Since a number isn't very easy to decipher on all platforms, update the trace messages to include the name of NBD errors being sent/received over the wire. Tweak the trace messages to be at the point where we are using the NBD error, not the translation to the host errno values. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20171027104037.8319-2-eblake@redhat.com>
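A lookup from wire value to name, of the kind a trace message can use, might look like the Python sketch below. The numeric values are as I recall them from the NBD protocol document (they mirror Linux errno numbers) and should be verified before use; the table and function names are hypothetical.

```python
# Assumed wire values for NBD errors; see the lead-in for the caveat.
NBD_ERROR_NAMES = {
    0: "success",
    1: "EPERM",
    5: "EIO",
    12: "ENOMEM",
    22: "EINVAL",
    28: "ENOSPC",
    75: "EOVERFLOW",
    108: "ESHUTDOWN",
}


def nbd_error_name(err):
    """Map a wire error value to a name suitable for a trace message."""
    return NBD_ERROR_NAMES.get(err, "<unknown>")


def format_trace(err):
    """Show both the number and the name, as a trace line might."""
    return "error = %d (%s)" % (err, nbd_error_name(err))
```

Printing the name next to the number keeps the trace readable on hosts whose own errno values differ from the wire values.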
2017-10-12 | nbd/server: structurize simple reply header sending | Vladimir Sementsov-Ogievskiy | 1 | -1/+0
Use a packed structure instead of pointer arithmetic. Also, merge two redundant traces into one. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20171012095319.136610-5-vsementsov@virtuozzo.com> [eblake: tweak and mention impact on traces, fix errp usage] Signed-off-by: Eric Blake <eblake@redhat.com>
2017-10-12 | nbd: rename some simple-request related objects to be _simple_ | Vladimir Sementsov-Ogievskiy | 1 | -1/+1
This keeps the names consistent once their _structured_ analogs are introduced. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20171012095319.136610-4-vsementsov@virtuozzo.com> [eblake: also tweak trace message contents] Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-01 | trace-events: fix code style: print 0x before hex numbers | Vladimir Sementsov-Ogievskiy | 1 | -9/+9
The only exception are groups of numers separated by symbols '.', ' ', ':', '/', like 'ab.09.7d'. This patch is made by the following: > find . -name trace-events | xargs python script.py where script.py is the following python script: ========================= #!/usr/bin/env python import sys import re import fileinput rhex = '%[-+ *.0-9]*(?:[hljztL]|ll|hh)?(?:x|X|"\s*PRI[xX][^"]*"?)' rgroup = re.compile('((?:' + rhex + '[.:/ ])+' + rhex + ')') rbad = re.compile('(?<!0x)' + rhex) files = sys.argv[1:] for fname in files: for line in fileinput.input(fname, inplace=True): arr = re.split(rgroup, line) for i in range(0, len(arr), 2): arr[i] = re.sub(rbad, '0x\g<0>', arr[i]) sys.stdout.write(''.join(arr)) ========================= Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Message-id: 20170731160135.12101-5-vsementsov@virtuozzo.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
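To see what the script's three regular expressions do to individual format strings, the per-line logic can be pulled out into a function and tried on a few samples. This is a sketch: the function name `add_hex_prefix` and the sample strings are made up for illustration, and raw string literals are used here (an adaptation, not part of the original script) so the patterns are free of escape-sequence warnings on current Python 3.

```python
import re

# Same patterns as in the commit's script, written as raw strings.
rhex = r'%[-+ *.0-9]*(?:[hljztL]|ll|hh)?(?:x|X|"\s*PRI[xX][^"]*"?)'
rgroup = re.compile('((?:' + rhex + '[.:/ ])+' + rhex + ')')
rbad = re.compile('(?<!0x)' + rhex)


def add_hex_prefix(line):
    """Hypothetical wrapper around the script's per-line transformation."""
    # Splitting on a capturing group leaves the separated hex groups
    # (such as '%02x:%02x') at the odd indices, which stay untouched.
    arr = re.split(rgroup, line)
    for i in range(0, len(arr), 2):
        # Prefix every hex conversion that is not already preceded by '0x'.
        arr[i] = re.sub(rbad, r'0x\g<0>', arr[i])
    return ''.join(arr)


print(add_hex_prefix('magic %x len %u'))     # bare %x gains a prefix
print(add_hex_prefix('flags 0x%x'))          # already prefixed, unchanged
print(add_hex_prefix('mac %02x:%02x:%02x'))  # separated group, unchanged
print(add_hex_prefix('handle = %" PRIx64'))  # PRIx64 macro form
```

The negative lookbehind `(?<!0x)` is what makes the transformation idempotent: running it a second time over already-converted text changes nothing.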
2017-07-17 | nbd: Trace client command being sent | Eric Blake | 1 | -1/+1
Make the client trace slightly more legible by including the name of the command being sent. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-Id: <20170717192635.17880-2-eblake@redhat.com>
2017-07-14 | nbd: Implement NBD_INFO_BLOCK_SIZE on client | Eric Blake | 1 | -0/+1
The upstream NBD Protocol has defined a new extension to allow the server to advertise block sizes to the client, as well as a way for the client to inform the server whether it intends to obey block sizes. When using the block layer as the client, we will obey block sizes; but when used as 'qemu-nbd -c' to hand off to the kernel nbd module as the client, we are still waiting for the kernel to implement a way for us to learn if it will honor block sizes (perhaps by an addition to sysfs, rather than an ioctl), as well as any way to tell the kernel what additional block sizes to obey (NBD_SET_BLKSIZE appears to be accurate for the minimum size, but preferred and maximum sizes would probably be new ioctl()s), so until then, we need to make our request for block sizes conditional. When using ioctl(NBD_SET_BLKSIZE) to hand off to the kernel, use the minimum block size as the sector size if it is larger than 512, which also has the nice effect of cooperating with (non-qemu) servers that don't do read-modify-write when exposing a block device with 4k sectors; it might also allow us to visit a file larger than 2T on a 32-bit kernel. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-10-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
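The sector-size choice for the kernel hand-off, and why a larger sector size might raise the 2T limit, can be sketched with a little arithmetic. The helper names are hypothetical, and the assumption that the limit comes from a 32-bit count of sectors is an illustration of the commit's "might", not something the message states.

```python
BDRV_SECTOR_SIZE = 512  # QEMU's traditional sector size, in bytes


def kernel_sector_size(min_block):
    """Hypothetical helper: sector size to pass via ioctl(NBD_SET_BLKSIZE).

    Use the server's minimum block size only when it is larger than 512.
    """
    return max(BDRV_SECTOR_SIZE, min_block)


def max_addressable_bytes(sector_size, count_bits=32):
    """Largest size reachable if sectors are counted in `count_bits` bits.

    Assumption: a 32-bit kernel tracks the device size as a 32-bit count
    of sectors, so the byte limit scales with the sector size.
    """
    return (1 << count_bits) * sector_size


TIB = 1024 ** 4
print(max_addressable_bytes(kernel_sector_size(1)) // TIB)     # 512-byte sectors
print(max_addressable_bytes(kernel_sector_size(4096)) // TIB)  # 4k sectors
```

Under that assumption, 512-byte sectors cap the device at 2 TiB, while 4k sectors move the cap to 16 TiB.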
2017-07-14 | nbd: Implement NBD_INFO_BLOCK_SIZE on server | Eric Blake | 1 | -0/+1
The upstream NBD Protocol has defined a new extension to allow the server to advertise block sizes to the client, as well as a way for the client to inform the server that it intends to obey block sizes. Thanks to a recent fix (commit df7b97ff), our real minimum transfer size is always 1 (the block layer takes care of read-modify-write on our behalf), but we're still more efficient if we advertise 512 when the client supports it, as follows:
- OPT_INFO, but no NBD_INFO_BLOCK_SIZE: advertise 512, then fail with NBD_REP_ERR_BLOCK_SIZE_REQD; client is free to try something else since we don't disconnect
- OPT_INFO with NBD_INFO_BLOCK_SIZE: advertise 512
- OPT_GO, but no NBD_INFO_BLOCK_SIZE: advertise 1
- OPT_GO with NBD_INFO_BLOCK_SIZE: advertise 512
We can also advertise the optimum block size (presumably the cluster size, when exporting a qcow2 file), and our absolute maximum transfer size of 32M, to help newer clients avoid EINVAL failures or abrupt disconnects on oversize requests. We do not reject clients for using the older NBD_OPT_EXPORT_NAME; we are no worse off for those clients than we used to be. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-9-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
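The four server cases listed in this commit message amount to a small decision table. The sketch below encodes it directly; the function name and the string-based representation of options and errors are hypothetical simplifications, not the server code.

```python
def advertised_min_block(option, client_sent_block_size):
    """Return (minimum block size to advertise, error reply or None).

    `option` is "NBD_OPT_INFO" or "NBD_OPT_GO"; `client_sent_block_size`
    says whether the client's request included NBD_INFO_BLOCK_SIZE.
    """
    if option == "NBD_OPT_INFO":
        if client_sent_block_size:
            return 512, None
        # Advertise 512, then fail; the connection stays open so the
        # client is free to try something else.
        return 512, "NBD_REP_ERR_BLOCK_SIZE_REQD"
    if option == "NBD_OPT_GO":
        # Without the client's promise to obey block sizes, fall back to
        # the real minimum of 1 rather than refusing the client.
        return (512 if client_sent_block_size else 1), None
    # NBD_OPT_EXPORT_NAME carries no block size information at all.
    raise ValueError("no block size advertisement for %s" % option)


for opt in ("NBD_OPT_INFO", "NBD_OPT_GO"):
    for sent in (False, True):
        print(opt, sent, advertised_min_block(opt, sent))
```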
2017-07-14 | nbd: Implement NBD_OPT_GO on client | Eric Blake | 1 | -0/+3
NBD_OPT_EXPORT_NAME is lousy: per the NBD protocol, any failure requires the server to close the connection rather than report an error to us. Therefore, upstream NBD recently added NBD_OPT_GO as the improved version of the option that does what we want [1]: it reports sane errors on failures, and on success provides at least as much info as NBD_OPT_EXPORT_NAME. [1] https://github.com/NetworkBlockDevice/nbd/blob/extension-info/doc/proto.md This is a first cut at use of the information types. Note that we do not need to use NBD_OPT_INFO, and that use of NBD_OPT_GO means we no longer have to use NBD_OPT_LIST to learn whether a server requires TLS. This requires servers that gracefully handle an unknown NBD_OPT; many servers prior to qemu 2.5 were buggy, but I have patched qemu, upstream nbd, and nbdkit in the meantime, in part because of interoperability testing with this patch. We still fall back to NBD_OPT_LIST when NBD_OPT_GO is not supported on the server, as it is still one last chance for a nicer error message. Later patches will use further info, like NBD_INFO_BLOCK_SIZE. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-8-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
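The client-side order of attempts described here (NBD_OPT_GO first, then NBD_OPT_LIST and NBD_OPT_EXPORT_NAME when the server does not know NBD_OPT_GO) can be sketched as below. The option and reply numbers are taken from the NBD protocol document; the `send_option` callback and the single-reply simplification are assumptions made for illustration (a real NBD_OPT_GO exchange carries NBD_REP_INFO replies before the final acknowledgement).

```python
NBD_OPT_EXPORT_NAME = 1
NBD_OPT_LIST = 3
NBD_OPT_GO = 7

NBD_REP_ACK = 1
NBD_REP_ERR_UNSUP = (1 << 31) | 1  # server does not know the option


def negotiate_export(send_option):
    """Return the list of options sent while selecting an export.

    `send_option(opt)` is a hypothetical callback that sends one option
    and returns the type of the server's final reply to it.
    """
    sent = [NBD_OPT_GO]
    reply = send_option(NBD_OPT_GO)
    if reply == NBD_REP_ACK:
        return sent  # server understood NBD_OPT_GO; nothing else needed
    if reply != NBD_REP_ERR_UNSUP:
        # NBD_OPT_GO reports real errors without dropping the connection.
        raise RuntimeError("server rejected NBD_OPT_GO: 0x%x" % reply)
    # Older server: NBD_OPT_LIST is one last chance for a nicer error
    # message before NBD_OPT_EXPORT_NAME, which can only fail by disconnect.
    sent.append(NBD_OPT_LIST)
    send_option(NBD_OPT_LIST)
    sent.append(NBD_OPT_EXPORT_NAME)
    send_option(NBD_OPT_EXPORT_NAME)
    return sent


new_server = lambda opt: NBD_REP_ACK
old_server = lambda opt: NBD_REP_ERR_UNSUP if opt == NBD_OPT_GO else NBD_REP_ACK
print(negotiate_export(new_server))
print(negotiate_export(old_server))
```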
2017-07-14 | nbd: Implement NBD_OPT_GO on server | Eric Blake | 1 | -0/+3
NBD_OPT_EXPORT_NAME is lousy: per the NBD protocol, any failure requires us to close the connection rather than report an error. Therefore, upstream NBD recently added NBD_OPT_GO as the improved version of the option that does what we want [1], along with NBD_OPT_INFO that returns the same information but does not transition to transmission phase. [1] https://github.com/NetworkBlockDevice/nbd/blob/extension-info/doc/proto.md This is a first cut at the information types, and only passes the same information already available through NBD_OPT_LIST and NBD_OPT_EXPORT_NAME; items like NBD_INFO_BLOCK_SIZE (and thus any use of NBD_REP_ERR_BLOCK_SIZE_REQD) are intentionally left for later patches. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-7-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 | nbd: Simplify trace of client flags in negotiation | Eric Blake | 1 | -3/+1
Simplify the tracing of client flags in the server, and return -EINVAL instead of -EIO if we successfully read but don't like those flags. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-5-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 | nbd: Expose and debug more NBD constants | Eric Blake | 1 | -6/+6
The NBD protocol has several constants defined in various extensions that we are about to implement. Expose them to the code, along with an easy way to map various constants to strings during diagnostic messages. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-4-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
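A constant-to-string mapping for diagnostics of the kind this commit describes might look like the sketch below. The option numbers follow the NBD protocol document; the table covers only the handshake options and the helper names are hypothetical.

```python
# Handshake option numbers from the NBD protocol document.
NBD_OPT_NAMES = {
    1: "export name",
    2: "abort",
    3: "list",
    5: "starttls",
    6: "info",
    7: "go",
}


def nbd_opt_lookup(opt):
    """Hypothetical helper: readable name for an option, for diagnostics."""
    return NBD_OPT_NAMES.get(opt, "<unknown>")


def describe_option(opt):
    """Show the number and the name together, so unknown values stay visible."""
    return "option %d (%s)" % (opt, nbd_opt_lookup(opt))


print(describe_option(7))
print(describe_option(99))
```

Printing the raw number alongside the name means an unrecognized option from a newer peer is still identifiable in the output.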
2017-07-14 | nbd: Don't bother tracing an NBD_OPT_ABORT response failure | Eric Blake | 1 | -1/+0
We really don't care if our spec-compliant reply to NBD_OPT_ABORT was received, so shave off some lines of code by not even tracing it. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-3-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-10 | nbd: use generic trace subsystem instead of TRACE macro | Vladimir Sementsov-Ogievskiy | 1 | -0/+56
Let NBD use the trace mechanisms already present in qemu. Now you can use the -trace option of qemu, or the -T/--trace option of qemu-img, qemu-io, and qemu-nbd, to select nbd traces. For qemu, the QMP commands trace-event-{get,set}-state can also toggle tracing on the fly. Example: qemu-nbd --trace 'nbd_*' <image file> # enables all nbd traces Recompilation with CFLAGS=-DDEBUG_NBD is no longer needed; furthermore, the DEBUG_NBD macro is removed from the code. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170707152918.23086-11-vsementsov@virtuozzo.com> [eblake: minor tweaks to a couple of traces] Signed-off-by: Eric Blake <eblake@redhat.com>
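For the on-the-fly route, the QMP message that toggles the nbd traces can be built as in the sketch below. It only constructs the JSON text for trace-event-set-state with its `name` and `enable` arguments; the helper name is hypothetical, and connecting to the QMP socket and sending the message is left out.

```python
import json


def trace_set_state_cmd(pattern, enable):
    """Hypothetical helper: JSON text of a trace-event-set-state command."""
    return json.dumps({
        "execute": "trace-event-set-state",
        "arguments": {"name": pattern, "enable": enable},
    })


# Enable, then disable, every event whose name starts with "nbd_".
print(trace_set_state_cmd("nbd_*", True))
print(trace_set_state_cmd("nbd_*", False))
```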