path: root/jimregexp.c
Age | Commit message | Author | Files | Lines
2023-04-19 | regexp: fix incorrect check for invalid escape sequence at end of charset | Steve Bennett | 1 | -1/+1
Fixes #259 Signed-off-by: Steve Bennett <steveb@workware.net.au>
2023-04-19 | regexp: fix check for termination in [[:class:]] | Steve Bennett | 1 | -0/+4
Fixes #259 Signed-off-by: Steve Bennett <steveb@workware.net.au>
2022-12-03 | regexp: fix end of word check | Steve Bennett | 1 | -1/+1
The end-of-word check was wrong and returned true when it should not have. Fixes #246 Signed-off-by: Steve Bennett <steveb@workware.net.au>
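As an illustration of what an end-of-word test has to decide, here is a minimal sketch. It is not the code from jimregexp.c; the names `is_word_char` and `at_word_end` are hypothetical, and it assumes a word character is an ASCII alphanumeric or underscore.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: a "word" character is alphanumeric or underscore. */
static int is_word_char(unsigned char c)
{
    return isalnum(c) || c == '_';
}

/*
 * Hypothetical end-of-word test (an assumption, not the jimregexp.c code).
 * Returns 1 if position pos in s (length len) is at the end of a word:
 * the previous character is a word character and the next one is not,
 * or pos is the end of the string. Returns 0 otherwise.
 */
static int at_word_end(const char *s, size_t len, size_t pos)
{
    assert(s != NULL && pos <= len);
    if (pos == 0 || !is_word_char((unsigned char)s[pos - 1])) {
        return 0;   /* nothing before pos that could end a word */
    }
    return pos == len || !is_word_char((unsigned char)s[pos]);
}
```

The point of the sketch is that both sides of the position matter: checking only the following character would report an end of word at a position that is not preceded by a word at all.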
2020-07-31 | jimregexp: rename local regex functions | Steve Bennett | 1 | -4/+4
Avoid possible problems when linking by renaming the local regex functions to jim_regcomp, jim_regexec, etc. Fixes: #163 Signed-off-by: Steve Bennett <steveb@workware.net.au>
2020-05-04 | regexp: Improved error message | Steve Bennett | 1 | -3/+17
Detect and produce an error for a missing closing bracket ]. Consider a trailing backslash as an invalid escape. Signed-off-by: Steve Bennett <steveb@workware.net.au>
2019-12-30 | regexp: Reset scanner position on failed optional group | Steve Bennett | 1 | -0/+2
Signed-off-by: Steve Bennett <steveb@workware.net.au>
2019-11-01 | regexp,regsub: utf8: Fix incorrect count with . matches | Steve Bennett | 1 | -2/+1
Internally, bytes were being counted rather than characters. Reported-by: dbohdan <dbohdan@dbohdan.com> Signed-off-by: Steve Bennett <steveb@workware.net.au>
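The difference between a byte count and a character count in UTF-8 can be sketched as follows. This is a generic illustration, not the fix itself; `utf8_char_count` is a hypothetical name, and the sketch assumes well-formed UTF-8 input.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the characters (code points) in the first
 * nbytes bytes of s by counting every byte that is NOT a UTF-8
 * continuation byte (bit pattern 10xxxxxx).
 * Assumes the input is well-formed UTF-8.
 */
static size_t utf8_char_count(const char *s, size_t nbytes)
{
    size_t i, count = 0;

    assert(s != NULL);
    for (i = 0; i < nbytes; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80) {
            count++;    /* lead byte or ASCII byte starts a new character */
        }
    }
    return count;
}
```

For the 6-byte string "h\xC3\xA9llo" (the word "héllo"), this returns 5, whereas a byte count gives 6.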
2017-12-31 | regexp: Implement class shorthand escapes in brackets | Steve Bennett | 1 | -11/+29
The following class shorthand escapes now match Tcl when used within bracket expressions: \d is [[:digit:]], \s is [[:space:]], \w is [[:alnum:]_] (note the underscore). For example, [a-f\d] => [a-f0-9]. Previously these shorthand escapes were only implemented outside bracket expressions. Signed-off-by: Steve Bennett <steveb@workware.net.au>
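The mapping from shorthand escapes to character classes can be sketched with the standard `<ctype.h>` predicates. This is only an illustration under the assumption of single-byte characters in the C locale; `shorthand_matches` and `in_a_f_or_digit` are hypothetical names, not functions from jimregexp.c.

```c
#include <assert.h>
#include <ctype.h>

/*
 * Hypothetical helper: does character ch match the shorthand escape
 * \d, \s or \w? esc is the letter after the backslash.
 * Returns 1 on a match, 0 otherwise (including for unknown escapes).
 */
static int shorthand_matches(char esc, int ch)
{
    assert(ch >= 0 && ch <= 255);
    switch (esc) {
        case 'd': return isdigit(ch) != 0;              /* [[:digit:]]  */
        case 's': return isspace(ch) != 0;              /* [[:space:]]  */
        case 'w': return isalnum(ch) != 0 || ch == '_'; /* [[:alnum:]_] */
        default:  return 0;
    }
}

/* Hand-expanded equivalent of the bracket expression [a-f\d]. */
static int in_a_f_or_digit(int ch)
{
    return (ch >= 'a' && ch <= 'f') || shorthand_matches('d', ch);
}
```

With this sketch, `in_a_f_or_digit` accepts the same characters as [a-f0-9], which is the equivalence given in the commit message.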
2017-05-12 | regexp: Fix bad memory access on missing close brace | Steve Bennett | 1 | -0/+4
For counted repetitions Reported-by: Ryan Whitworth <me@ryanwhitworth.com> Signed-off-by: Steve Bennett <steveb@workware.net.au>
2016-08-29 | Fix some minor compiler warnings. | Steve Bennett | 1 | -1/+1
Mostly from -Wshadow Signed-off-by: Steve Bennett <steveb@workware.net.au>
2016-02-02 | regexp: Add missing support for character classes | Steve Bennett | 1 | -18/+64
[[:blank:]], [[:xdigit:]], etc. Signed-off-by: Steve Bennett <steveb@workware.net.au>
2016-02-02 | regexp: add partial support for \A \Z matching | Steve Bennett | 1 | -5/+28
Still not 100% correct, for example when used with regsub -all Signed-off-by: Steve Bennett <steveb@workware.net.au>
2016-02-02 | regexp: add support for \D, \W and \S | Steve Bennett | 1 | -5/+9
These are the negated versions of \d, \w and \s Signed-off-by: Steve Bennett <steveb@workware.net.au>
2014-04-23 | jimregexp: remove dead code | Steve Bennett | 1 | -2/+1
Courtesy of coverity Signed-off-by: Steve Bennett <steveb@workware.net.au>
2014-04-23 | jimregexp: missing break for \U handling | Steve Bennett | 1 | -0/+1
Courtesy of coverity Signed-off-by: Steve Bennett <steveb@workware.net.au>
2014-01-21 | many comment changes, some small code changes | Steve Bennett | 1 | -3/+4
Sweep through and clean up all (most) of the comments in the code. While there, adjust some variable and function names to be more consistent, and make a few small code changes - again, mostly for consistency. Signed-off-by: Steve Bennett <steveb@workware.net.au>
2014-01-15 | jimregexp: code simplifications and doc cleanups | Steve Bennett | 1 | -94/+87
Signed-off-by: Steve Bennett <steveb@workware.net.au>
2013-11-11 | regexp: fix utf8_setunicode -> utf8_getchars | Steve Bennett | 1 | -4/+4
Signed-off-by: Steve Bennett <steveb@workware.net.au>
2013-11-06 | Fix [string tolower] buffer overflow for non-utf8 | Steve Bennett | 1 | -4/+4
Reported-by: Andy <jimdevel@hummypkg.org.uk> Signed-off-by: Steve Bennett <steveb@workware.net.au>
2012-01-19 | Fix some warnings identified by icc | Steve Bennett | 1 | -2/+1
The Intel C Compiler Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-12-08 | Fix a regexp infinite loop on bad utf-8 input | Steve Bennett | 1 | -1/+4
regsub {\mdnl\M.*$} "word \xA9 another word" "" line Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-12-02 | Add support for \U with up to 8 hex digits | Steve Bennett | 1 | -0/+4
TIP #388 compatibility Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-12-02 | Extend UTF-8 support past the BMP | Steve Bennett | 1 | -2/+13
Now codepoints up to U+1FFFFF are supported, including as literals with the new \u{NNNNNN} syntax (up to six hex digits) Signed-off-by: Steve Bennett <steveb@workware.net.au>
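The U+1FFFFF limit corresponds to the 21 payload bits available in a 4-byte UTF-8 sequence. A minimal encoder sketch showing this follows; `utf8_encode` is a hypothetical name rather than the function used in the Jim sources, and the sketch deliberately does not reject surrogates or values above U+10FFFF.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical encoder: write code point cp to out as UTF-8 and return
 * the number of bytes written (1 to 4), or 0 if cp needs more than the
 * 21 bits a 4-byte sequence can carry (cp > 0x1FFFFF).
 */
static int utf8_encode(unsigned long cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                    /* 7 bits: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                   /* 11 bits: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                 /* 16 bits: 1110xxxx + 2 more */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x1FFFFF) {               /* 21 bits: 11110xxx + 3 more */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For example, U+1F600, which lies outside the BMP, encodes to the four bytes F0 9F 98 80.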
2011-11-10 | regex: counts were not all being cleared | Steve Bennett | 1 | -6/+25
If a cached regex containing counts was reused, the result may have been incorrect. Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-11-10 | regex: support - as the last element of a char set | Steve Bennett | 1 | -1/+1
e.g. {[a-z-]}. For Tcl ARE compatibility. Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-11-10 | regex: add support for non-capturing parentheses | Steve Bennett | 1 | -3/+20
Tcl-compatible syntax: (?:...) Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-09-12 | Remove all trailing whitespace in source | Steve Bennett | 1 | -7/+7
Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-09-12 | Trim the size of the bootstrap jimsh source | Steve Bennett | 1 | -3/+2
By removing comments and some large blocks of unnecessary code Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-07-07 | Minor code cleanups | Steve Bennett | 1 | -8/+2
Some unused variables Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-06-29 | Change the builtin regexp to avoid compiling twice | Steve Bennett | 1 | -281/+238
Simply guess the program size and realloc if needed. This also fixes a compile warning on some platforms. Signed-off-by: Steve Bennett <steveb@workware.net.au>
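The "guess the size, then realloc if needed" approach can be sketched as a growable buffer that takes the place of a separate sizing pass. This is a generic illustration, not the jimregexp.c implementation; `struct prog_buf`, `prog_init`, `prog_emit` and `prog_free` are hypothetical names, and the doubling growth policy is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable buffer for compiled regex opcodes. */
struct prog_buf {
    unsigned char *data;
    size_t len;     /* bytes used */
    size_t cap;     /* bytes allocated */
};

/* Allocate an initial buffer from a size guess. Returns 0, or -1 on failure. */
static int prog_init(struct prog_buf *b, size_t guess)
{
    assert(b != NULL);
    if (guess == 0) {
        guess = 1;
    }
    b->data = malloc(guess);
    b->len = 0;
    b->cap = b->data ? guess : 0;
    return b->data ? 0 : -1;
}

/* Append one opcode, growing the buffer only when the guess was too small. */
static int prog_emit(struct prog_buf *b, unsigned char op)
{
    assert(b != NULL);
    if (b->len == b->cap) {
        size_t ncap = b->cap ? b->cap * 2 : 16;
        unsigned char *p = realloc(b->data, ncap);
        if (p == NULL) {
            return -1;      /* old buffer is still valid and owned by b */
        }
        b->data = p;
        b->cap = ncap;
    }
    b->data[b->len++] = op;
    return 0;
}

static void prog_free(struct prog_buf *b)
{
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
}
```

One consequence of this design is that `realloc` may move the buffer, so code emitting into it would need to refer to earlier positions by offset rather than by raw pointer.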
2011-06-28 | Fix builtin regexp for memory overwrite | Steve Bennett | 1 | -3/+5
Reported-By: Spencer Oliver <spen@spen-soft.co.uk> Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-06-09 | Revert regexp nested repeats from b34ab2f895 | Steve Bennett | 1 | -5/+6
Nested repeats can't really be handled properly, so remove support since it breaks some non-nested cases. Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-06-07 | Simplify/fix repeating matches | Steve Bennett | 1 | -297/+228
Simplifies *, + and {n,m}, fixes some broken cases and adds support for {n,m}?. Also fixes end-of-word match. Under some circumstances, repeats can now be nested. Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-06-05 | Fix simple * and + case for utf-8 | Steve Bennett | 1 | -5/+7
Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-06-04 | Fix utf8 char matching in character ranges | Steve Bennett | 1 | -41/+42
Also searching the initial part of the string Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-06-03 | Add non-greedy regexp support | Steve Bennett | 1 | -28/+147
Support +?, *? and ?? Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-06-03 | Add make-bootstrap-jim script | Steve Bennett | 1 | -11/+11
Allows a single source file version of jimsh to be created for bootstrap purposes. Signed-off-by: Steve Bennett <steveb@workware.net.au>
2011-04-08 | Fix some minor warnings on mingw32 | Steve Bennett | 1 | -3/+3
Signed-off-by: Steve Bennett <steveb@workware.net.au>
2010-11-28 | Bug fix: regexp should not treat \n as | | Steve Bennett | 1 | -4/+3
Remove a "feature" in the built-in regexp, where a newline in the pattern was treated as alternation, like |. Signed-off-by: Steve Bennett <steveb@workware.net.au>
2010-11-18 | built-in regexp was always being included | Steve Bennett | 1 | -0/+1
Even if disabled, the built-in regexp was still being used. Signed-off-by: Steve Bennett <steveb@workware.net.au>
2010-11-17 | Bug fix: [regexp] single braced count was rejected | Steve Bennett | 1 | -5/+10
The form {n} should be considered the same as {n,n} Signed-off-by: Steve Bennett <steveb@workware.net.au>
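A minimal parser sketch for the counted-repetition forms shows how {n} can be treated as {n,n}. It is not the jimregexp.c parser; `parse_count` is a hypothetical name, representing an open upper bound such as {n,} by -1 is an assumption, and the sketch checks neither integer overflow nor min > max.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical parser for "{n}", "{n,}" and "{n,m}" at the start of s.
 * On success stores the bounds in *min and *max (*max == -1 means
 * "no upper bound") and returns the number of characters consumed.
 * Returns -1 on a syntax error, including a missing closing brace.
 */
static int parse_count(const char *s, int *min, int *max)
{
    const char *p = s;
    int n;

    assert(s != NULL && min != NULL && max != NULL);
    if (*p++ != '{' || !isdigit((unsigned char)*p)) {
        return -1;
    }
    for (n = 0; isdigit((unsigned char)*p); p++) {
        n = n * 10 + (*p - '0');
    }
    *min = *max = n;            /* {n} is the same as {n,n} */
    if (*p == ',') {
        p++;
        if (isdigit((unsigned char)*p)) {
            for (n = 0; isdigit((unsigned char)*p); p++) {
                n = n * 10 + (*p - '0');
            }
            *max = n;
        } else {
            *max = -1;          /* {n,} has no upper bound */
        }
    }
    if (*p != '}') {
        return -1;              /* also catches end of string */
    }
    return (int)(p - s) + 1;
}
```

Because the closing brace is checked before anything past it is read, an unterminated count such as "{3" is rejected without reading beyond the end of the string.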
2010-11-17 | Minor cleanups | Steve Bennett | 1 | -1/+0
Signed-off-by: Steve Bennett <steveb@workware.net.au>
2010-11-17 | Fix a regexec() bug | Steve Bennett | 1 | -2/+2
An anchored search could use the wrong string Signed-off-by: Steve Bennett <steveb@workware.net.au>
2010-11-17 | Update documentation to cover UTF-8 support for regexp | Steve Bennett | 1 | -5/+6
Also create README.utf-8 Signed-off-by: Steve Bennett <steveb@workware.net.au>
2010-11-17 | Add UTF-8 support to regexp | Steve Bennett | 1 | -251/+621
Plus various ARE enhancements and bug fixes Signed-off-by: Steve Bennett <steveb@workware.net.au>
2010-11-17 | POSIX-compatible regex interface | Steve Bennett | 1 | -0/+1375
With some ARE extensions Signed-off-by: Steve Bennett <steveb@workware.net.au>