author | Jonathan Wakely <jwakely@redhat.com> | 2021-12-14 14:32:35 +0000 |
---|---|---|
committer | Jonathan Wakely <jwakely@redhat.com> | 2021-12-14 21:45:46 +0000 |
commit | 7ce3c230edf6e498e125c805a6dd313bf87dc439 (patch) | |
tree | 7f81e7a88d1d39c5a870451c49a3711e601a692e /gcc | |
parent | fda28722703d7ab8903ce5f616e3efed1bbdbc25 (diff) | |
libstdc++: Fix handling of invalid ranges in std::regex [PR102447]
std::regex currently allows invalid bracket ranges such as [\w-a] which
are only allowed by ECMAScript when in web browser compatibility mode.
Such a range should be an error, because its start is a character
class, not a single character. The current implementation of
_Compiler::_M_expression_term does not provide a way to reject this,
because it only remembers the previous character, not whether it just
processed a character class (or a collating symbol, etc.).
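The behaviour described above can be probed from user code. The following is a minimal sketch, not part of the patch: the helper names compiles and check_valid_dash_cases are hypothetical, and the result for [\w-a] depends on whether the standard library in use contains this fix, so it is deliberately not asserted.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns true if `pattern` compiles with the
// default (ECMAScript) grammar, false if std::regex rejects it.
bool compiles(const std::string& pattern)
{
  try {
    std::regex re(pattern);
    (void)re;
    return true;
  } catch (const std::regex_error&) {
    return false;
  }
}

// The cases the commit message says must remain valid.
void check_valid_dash_cases()
{
  assert(compiles("[\\w-]"));   // class followed by a literal dash
  assert(compiles("[----]"));   // range from '-' to '-', then a literal dash
}

// compiles("[\\w-a]") is expected to be false with this fix applied and
// true without it; the outcome is left unasserted because it depends on
// the library version.
```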
This patch replaces the pair<bool, CharT> used to emulate
optional<CharT> with a custom class closer to pair<tribool,CharT>. That
allows us to track three states, so that we can tell when we've just
seen a character class.
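As a rough illustration of the three-state idea, a class along the following lines could replace the pair<bool, CharT>. This is a sketch under assumptions, not the code from the patch: the name BracketState and all member names are made up for the example, and the real _BracketState may differ in interface and detail.

```cpp
#include <cassert>

// Hypothetical stand-in for the commit's _BracketState.
// It records whether the previous item in a bracket expression was
// nothing, a single character, or a character class.
template<typename CharT>
class BracketState
{
public:
  void set_char(CharT c) { kind_ = Kind::Char; ch_ = c; }
  void set_class()       { kind_ = Kind::Class; }
  void reset()           { kind_ = Kind::None; }

  bool is_none()  const { return kind_ == Kind::None; }
  bool is_char()  const { return kind_ == Kind::Char; }
  bool is_class() const { return kind_ == Kind::Class; }

  // Only meaningful when a single character is stored.
  CharT get() const
  {
    assert(is_char());
    return ch_;
  }

private:
  enum class Kind { None, Char, Class };
  Kind  kind_ = Kind::None;
  CharT ch_   = CharT();
};
```

With only a bool there is no way to distinguish "nothing pending" from "just saw a class", which is exactly the distinction needed when a dash follows.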
With this additional state the code in _M_expression_term for processing
the _S_token_bracket_dash can be improved to correctly reject the [\w-a]
case, without regressing for valid cases such as [\w-] and [----].
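To make the dash handling concrete, here is a toy validator for the text between '[' and ']'. It is a simplified sketch and not the libstdc++ implementation: valid_bracket_body and is_class_escape are hypothetical names, only \w, \d, \s (and their upper-case forms) are treated as classes, and POSIX forms such as [:alpha:] or collating symbols are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// What the previous item in the bracket expression was.
enum class Last { None, Char, Class };

inline bool is_class_escape(char c)
{
  return c == 'w' || c == 'W' || c == 'd' || c == 'D' || c == 's' || c == 'S';
}

// Toy check of a bracket body using ECMAScript-like rules (illustrative only).
bool valid_bracket_body(const std::string& body)
{
  Last last = Last::None;
  char last_ch = '\0';
  std::size_t i = 0;
  const std::size_t n = body.size();

  // A leading dash is an ordinary character.
  if (n > 0 && body[0] == '-') { last = Last::Char; last_ch = '-'; i = 1; }

  while (i < n) {
    const char c = body[i];
    if (c == '\\') {
      if (i + 1 == n) return false;            // dangling backslash
      const char e = body[i + 1];
      if (is_class_escape(e)) last = Last::Class;
      else { last = Last::Char; last_ch = e; }
      i += 2;
    } else if (c == '-') {
      if (i + 1 == n) return true;             // "-]": literal dash at the end
      if (last == Last::Class) return false;   // "\w-a": range starts with a class
      if (last == Last::Char) {
        assert(i + 1 < n);                     // guaranteed by the check above
        char end = body[i + 1];
        std::size_t used = 2;
        if (end == '\\') {
          // The end of a range must also be a single character.
          if (i + 2 == n || is_class_escape(body[i + 2])) return false;
          end = body[i + 2];
          used = 3;
        }
        if (static_cast<unsigned char>(last_ch) > static_cast<unsigned char>(end))
          return false;                        // reversed range such as "z-a"
        last = Last::None;                     // range complete, nothing pending
        i += used;
      } else {
        // No pending character: the dash is literal and may start a new range.
        last = Last::Char; last_ch = '-';
        i += 1;
      }
    } else {
      last = Last::Char; last_ch = c;
      i += 1;
    }
  }
  return true;
}
```

In this sketch "\\w-a" is rejected at the dash because the pending state is a class, while "\\w-" and "----" are accepted, mirroring the three cases named in the commit message.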
libstdc++-v3/ChangeLog:
PR libstdc++/102447
* include/bits/regex_compiler.h (_Compiler::_BracketState): New
class.
(_Compiler::_BracketMatcher): New alias template.
(_Compiler::_M_expression_term): Change pair<bool, CharT>
parameter to _BracketState. Process first character for
ECMAScript syntax as well as POSIX.
* include/bits/regex_compiler.tcc
(_Compiler::_M_insert_bracket_matcher): Pass _BracketState.
(_Compiler::_M_expression_term): Use _BracketState to store
state between calls. Improve handling of dashes in ranges.
* testsuite/28_regex/algorithms/regex_match/cstring_bracket_01.cc:
Add more tests for ranges containing dashes. Check invalid
ranges with character class at the beginning.