path: root/json-lexer.c
Age | Commit message | Author | Files | Lines
2011-06-07 | json-lexer: make lexer error-recovery more deterministic | Michael Roth | 1 | -4/+21
Currently when we reach an error state we effectively flush everything fed to the lexer, which can put us in a state where we keep feeding tokens into the parser at arbitrary offsets in the stream. This makes it difficult for the lexer/tokenizer/parser to get back in sync when the client sends bad input.
With these changes we emit an error state/token up to the tokenizer as soon as we reach an error state, and continue processing any data passed in rather than bailing out. The reset token will be used to reset the tokenizer and parser, such that they'll recover state as soon as the lexer begins generating valid token sequences again.
We also map chr(192,193,245-255) to an error state here, since they are invalid UTF-8 characters. QMP guest proxy/agent will use chr(255) to force a flush/reset of previous input for reliable delivery of certain events, so we also document that thoroughly here.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
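The byte values named in this message (192, 193 and 245-255, i.e. 0xC0, 0xC1 and 0xF5-0xFF) can never occur in well-formed UTF-8, which is why they are safe to reserve as error or reset markers. The following is a minimal sketch of that check; the function names are hypothetical and are not taken from json-lexer.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true for bytes that cannot appear anywhere in
 * well-formed UTF-8. 0xC0 and 0xC1 could only start overlong two-byte
 * sequences, and 0xF5..0xFF would encode values above U+10FFFF or are
 * not defined as lead bytes at all.
 */
static bool byte_never_in_utf8(unsigned char c)
{
    return c == 0xC0 || c == 0xC1 || c >= 0xF5;
}

/*
 * Hypothetical helper: index of the first such byte in buf, or len if
 * there is none. A lexer could treat that position as an error marker.
 */
static size_t first_invalid_utf8_byte(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (byte_never_in_utf8(buf[i])) {
            return i;
        }
    }
    return len;
}
```

Under these assumptions, a sender can inject 0xFF between messages knowing it cannot be confused with payload text, since no valid UTF-8 string contains it.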
2011-06-07 | json-lexer: fix flushing logic to not always go to error state | Michael Roth | 1 | -3/+3
Currently we flush the lexer by passing in a NULL character. This generally forces the lexer to go to the corresponding TERMINAL() state for whatever token type it is currently parsing, emits the token to the parser, then puts the lexer back into IN_START state. However, since a NULL character causes char_consumed to be 0, we always do a second pass after this, which puts us in the IN_ERROR state. Fix this behavior by adding a "flush" flag that tells the lexer not to do more than one iteration.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-06-07 | json-lexer: reset the lexer state on an invalid token | Anthony Liguori | 1 | -0/+3
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-06-07 | json-lexer: limit the maximum size of a given token | Anthony Liguori | 1 | -0/+13
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-04-15 | json-lexer: fix conflict with mingw32 ERROR definition | Blue Swirl | 1 | -3/+3
The name ERROR is too generic; it conflicts with the mingw32 ERROR definition. Replace ERROR with IN_ERROR.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-06-11 | remove unnecessary lookaheads | Paolo Bonzini | 1 | -32/+16
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2010-06-11 | implement optional lookahead in json lexer | Paolo Bonzini | 1 | -23/+35
Not requiring one extra character when lookahead is not necessary ensures that clients behave properly even if they, for example, send QMP requests without a trailing newline.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2010-06-11 | json-lexer: Drop 'buf' | Luiz Capitulino | 1 | -6/+1
QString supports adding a single char, so 'buf' is unneeded.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2010-06-11 | json-lexer: Handle missing escapes | Luiz Capitulino | 1 | -0/+4
The JSON escape sequences "\/" and "\\" are valid and should be handled.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2010-06-11 | json-lexer: Initialize 'x' and 'y' | Luiz Capitulino | 1 | -0/+1
The 'lexer' variable is passed by the caller, so it can contain anything (e.g. garbage).
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2010-02-10 | json: fix PRId64 on Win32 | Roy Tam | 1 | -0/+16
We were fooled by the JSON lexer and parser. We use %I64d to print 'long long' variables on Win32, but the lexer and parser only handle %lld, not %I64d. This patch adds support for %I64d and fixes the 'info pci', 'powser_reset' and 'power_powerdown' assertion failures on Win32.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-17 | Add a lexer for JSON | Anthony Liguori | 1 | -0/+327
Our JSON parser is a three-stage parser. The first stage tokenizes the stream into a set of lexical tokens. Since the lexical grammar is regular, we can use a finite state machine to model it. The state machine will emit tokens as they are identified.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
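The idea of modelling a regular lexical grammar with a finite state machine can be sketched with a small transition table indexed by state and input byte. Everything below is an illustrative assumption rather than the contents of json-lexer.c: the state names, the reduced grammar (punctuation, integers and lowercase keywords only, no strings) and the token-counting interface are all invented, and ASCII is assumed.

```c
#include <assert.h>
#include <stddef.h>

enum fsm_state {
    S_START,                 /* between tokens */
    S_IN_INT,                /* inside a run of digits */
    S_IN_WORD,               /* inside a run of lowercase letters */
    S_COUNT,                 /* number of real states */
    /* Pseudo-states: actions stored in the table, never entered. */
    S_EMIT_PUNCT = S_COUNT,  /* one-character token is complete */
    S_EMIT_PREV,             /* pending token ended; re-examine this byte */
    S_SKIP,                  /* whitespace between tokens */
    S_ERROR                  /* byte not allowed here */
};

static unsigned char fsm_table[S_COUNT][256];

static void fsm_init(void)
{
    const char *p;
    int c;

    for (c = 0; c < 256; c++) {
        fsm_table[S_START][c] = S_ERROR;
        fsm_table[S_IN_INT][c] = S_EMIT_PREV;
        fsm_table[S_IN_WORD][c] = S_EMIT_PREV;
    }
    for (c = '0'; c <= '9'; c++) {
        fsm_table[S_START][c] = S_IN_INT;
        fsm_table[S_IN_INT][c] = S_IN_INT;
    }
    for (c = 'a'; c <= 'z'; c++) {
        fsm_table[S_START][c] = S_IN_WORD;
        fsm_table[S_IN_WORD][c] = S_IN_WORD;
    }
    for (p = "{}[],:"; *p != '\0'; p++) {
        fsm_table[S_START][(unsigned char)*p] = S_EMIT_PUNCT;
    }
    for (p = " \t\r\n"; *p != '\0'; p++) {
        fsm_table[S_START][(unsigned char)*p] = S_SKIP;
    }
}

/* Returns the number of tokens in input, or -1 on the first invalid byte. */
static int fsm_count_tokens(const char *input)
{
    enum fsm_state state = S_START;
    int tokens = 0;
    const char *p = input;

    assert(input != NULL);
    fsm_init();
    while (*p != '\0') {
        unsigned char next = fsm_table[state][(unsigned char)*p];

        switch (next) {
        case S_EMIT_PUNCT: tokens++; state = S_START; p++; break;
        case S_EMIT_PREV:  tokens++; state = S_START; break;   /* do not advance */
        case S_SKIP:       p++; break;
        case S_ERROR:      return -1;
        default:           state = (enum fsm_state)next; p++; break;
        }
    }
    if (state != S_START) {
        tokens++;   /* end of input completes any pending token */
    }
    return tokens;
}
```

A real lexer would hand each token's text and type to the next stage instead of counting; the table lookup per input byte is the part this sketch is meant to show.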