Update bundled PCRE2 library to version 10.23
Some manual changes previously made to the library were lost in this update. They will be re-applied in the next commit.
13
pcre2/testdata/grepinput
vendored
@ -604,6 +604,19 @@ AB.VE the turtle
|
||||
|
||||
010203040506
|
||||
|
||||
match 1:
|
||||
a
|
||||
match 2:
|
||||
b
|
||||
match 3:
|
||||
c
|
||||
match 4:
|
||||
d
|
||||
match 5:
|
||||
e
|
||||
Rhubarb
|
||||
Custard Tart
|
||||
|
||||
PUT NEW DATA ABOVE THIS LINE.
|
||||
=============================
|
||||
|
||||
|
172
pcre2/testdata/grepoutput
vendored
@ -10,7 +10,7 @@ RC=0
|
||||
7:PATTERN at the start of a line.
|
||||
8:In the middle of a line, PATTERN appears.
|
||||
10:This pattern is in lower case.
|
||||
610:Check up on PATTERN near the end.
|
||||
623:Check up on PATTERN near the end.
|
||||
RC=0
|
||||
---------------------------- Test 4 ------------------------------
|
||||
4
|
||||
@ -19,7 +19,7 @@ RC=0
|
||||
./testdata/grepinput:7:PATTERN at the start of a line.
|
||||
./testdata/grepinput:8:In the middle of a line, PATTERN appears.
|
||||
./testdata/grepinput:10:This pattern is in lower case.
|
||||
./testdata/grepinput:610:Check up on PATTERN near the end.
|
||||
./testdata/grepinput:623:Check up on PATTERN near the end.
|
||||
./testdata/grepinputx:3:Here is the pattern again.
|
||||
./testdata/grepinputx:5:Pattern
|
||||
./testdata/grepinputx:42:This line contains pattern not on a line by itself.
|
||||
@ -28,7 +28,7 @@ RC=0
|
||||
7:PATTERN at the start of a line.
|
||||
8:In the middle of a line, PATTERN appears.
|
||||
10:This pattern is in lower case.
|
||||
610:Check up on PATTERN near the end.
|
||||
623:Check up on PATTERN near the end.
|
||||
3:Here is the pattern again.
|
||||
5:Pattern
|
||||
42:This line contains pattern not on a line by itself.
|
||||
@ -324,10 +324,10 @@ RC=0
|
||||
./testdata/grepinput-9-
|
||||
./testdata/grepinput:10:This pattern is in lower case.
|
||||
--
|
||||
./testdata/grepinput-607-PUT NEW DATA ABOVE THIS LINE.
|
||||
./testdata/grepinput-608-=============================
|
||||
./testdata/grepinput-609-
|
||||
./testdata/grepinput:610:Check up on PATTERN near the end.
|
||||
./testdata/grepinput-620-PUT NEW DATA ABOVE THIS LINE.
|
||||
./testdata/grepinput-621-=============================
|
||||
./testdata/grepinput-622-
|
||||
./testdata/grepinput:623:Check up on PATTERN near the end.
|
||||
--
|
||||
./testdata/grepinputx-1-This is a second file of input for the pcregrep tests.
|
||||
./testdata/grepinputx-2-
|
||||
@ -349,8 +349,8 @@ RC=0
|
||||
./testdata/grepinput-12-Here follows a whole lot of stuff that makes the file over 24K long.
|
||||
./testdata/grepinput-13-
|
||||
--
|
||||
./testdata/grepinput:610:Check up on PATTERN near the end.
|
||||
./testdata/grepinput-611-This is the last line of this file.
|
||||
./testdata/grepinput:623:Check up on PATTERN near the end.
|
||||
./testdata/grepinput-624-This is the last line of this file.
|
||||
--
|
||||
./testdata/grepinputx:3:Here is the pattern again.
|
||||
./testdata/grepinputx-4-
|
||||
@ -456,8 +456,8 @@ over the lazy dog.
|
||||
This time it jumps and jumps and jumps.
|
||||
RC=0
|
||||
---------------------------- Test 52 ------------------------------
|
||||
fox [1;31mjumps[00m
|
||||
This time it [1;31mjumps[00m and [1;31mjumps[00m and [1;31mjumps[00m.
|
||||
fox [1;31mjumps[0m
|
||||
This time it [1;31mjumps[0m and [1;31mjumps[0m and [1;31mjumps[0m.
|
||||
RC=0
|
||||
---------------------------- Test 53 ------------------------------
|
||||
36972,6
|
||||
@ -474,9 +474,9 @@ RC=0
|
||||
597:32,4
|
||||
RC=0
|
||||
---------------------------- Test 55 -----------------------------
|
||||
Here is the [1;31mpattern[00m again.
|
||||
That time it was on a [1;31mline by itself[00m.
|
||||
This line contains [1;31mpattern[00m not on a [1;31mline by itself[00m.
|
||||
Here is the [1;31mpattern[0m again.
|
||||
That time it was on a [1;31mline by itself[0m.
|
||||
This line contains [1;31mpattern[0m not on a [1;31mline by itself[0m.
|
||||
RC=0
|
||||
---------------------------- Test 56 -----------------------------
|
||||
./testdata/grepinput:456
|
||||
@ -588,56 +588,57 @@ RC=0
|
||||
---------------------------- Test 70 -----------------------------
|
||||
[1;31mtriple: t1_txt s1_tag s_txt p_tag p_txt o_tag o_txt
|
||||
|
||||
[00m[1;31mtriple: t3_txt s2_tag s_txt p_tag p_txt o_tag o_txt
|
||||
[0m[1;31mtriple: t3_txt s2_tag s_txt p_tag p_txt o_tag o_txt
|
||||
|
||||
[00m[1;31mtriple: t4_txt s1_tag s_txt p_tag p_txt o_tag o_txt
|
||||
[0m[1;31mtriple: t4_txt s1_tag s_txt p_tag p_txt o_tag o_txt
|
||||
|
||||
[00m[1;31mtriple: t6_txt s2_tag s_txt p_tag p_txt o_tag o_txt
|
||||
[0m[1;31mtriple: t6_txt s2_tag s_txt p_tag p_txt o_tag o_txt
|
||||
|
||||
[00mRC=0
|
||||
[0mRC=0
|
||||
---------------------------- Test 71 -----------------------------
|
||||
01
|
||||
RC=0
|
||||
---------------------------- Test 72 -----------------------------
|
||||
[1;31m01[00m0203040506
|
||||
[1;31m01[0m0203040506
|
||||
RC=0
|
||||
---------------------------- Test 73 -----------------------------
|
||||
[1;31m01[00m
|
||||
[1;31m01[0m
|
||||
RC=0
|
||||
---------------------------- Test 74 -----------------------------
|
||||
01
|
||||
02
|
||||
RC=0
|
||||
---------------------------- Test 75 -----------------------------
|
||||
[1;31m01[00m[1;31m02[00m03040506
|
||||
[1;31m01[0m[1;31m02[0m03040506
|
||||
RC=0
|
||||
---------------------------- Test 76 -----------------------------
|
||||
[1;31m01[00m
|
||||
[1;31m02[00m
|
||||
[1;31m01[0m
|
||||
[1;31m02[0m
|
||||
RC=0
|
||||
---------------------------- Test 77 -----------------------------
|
||||
01
|
||||
03
|
||||
RC=0
|
||||
---------------------------- Test 78 -----------------------------
|
||||
[1;31m01[00m02[1;31m03[00m040506
|
||||
[1;31m01[0m02[1;31m03[0m040506
|
||||
RC=0
|
||||
---------------------------- Test 79 -----------------------------
|
||||
[1;31m01[00m
|
||||
[1;31m03[00m
|
||||
[1;31m01[0m
|
||||
[1;31m03[0m
|
||||
RC=0
|
||||
---------------------------- Test 80 -----------------------------
|
||||
01
|
||||
RC=0
|
||||
---------------------------- Test 81 -----------------------------
|
||||
[1;31m01[00m0203040506
|
||||
[1;31m01[0m0203040506
|
||||
RC=0
|
||||
---------------------------- Test 82 -----------------------------
|
||||
[1;31m01[00m
|
||||
[1;31m01[0m
|
||||
RC=0
|
||||
---------------------------- Test 83 -----------------------------
|
||||
pcre2grep: line 4 of file ./testdata/grepinput3 is too long for the internal buffer
|
||||
pcre2grep: check the --buffer-size option
|
||||
pcre2grep: the maximum buffer size is 100
|
||||
pcre2grep: use the --max-buffer-size option to change it
|
||||
RC=2
|
||||
---------------------------- Test 84 -----------------------------
|
||||
testdata/grepinputv:fox jumps
|
||||
@ -701,9 +702,9 @@ RC=0
|
||||
./testdata/grepinput:zerothe.
|
||||
RC=0
|
||||
---------------------------- Test 101 ------------------------------
|
||||
./testdata/grepinput:[1;31m.[00m|[1;31mzero[00m|[1;31mthe[00m|[1;31m.[00m
|
||||
./testdata/grepinput:[1;31mzero[00m|[1;31ma[00m
|
||||
./testdata/grepinput:[1;31m.[00m|[1;31mzero[00m|[1;31mthe[00m|[1;31m.[00m
|
||||
./testdata/grepinput:[1;31m.[0m|[1;31mzero[0m|[1;31mthe[0m|[1;31m.[0m
|
||||
./testdata/grepinput:[1;31mzero[0m|[1;31ma[0m
|
||||
./testdata/grepinput:[1;31m.[0m|[1;31mzero[0m|[1;31mthe[0m|[1;31m.[0m
|
||||
RC=0
|
||||
---------------------------- Test 102 -----------------------------
|
||||
2:
|
||||
@ -724,21 +725,21 @@ RC=0
|
||||
14:
|
||||
RC=0
|
||||
---------------------------- Test 105 -----------------------------
|
||||
[1;31m[00mtriple: t1_txt s1_tag s_txt p_tag p_txt o_tag o_txt
|
||||
[1;31m[00m
|
||||
[1;31m[00mtriple: t2_txt s1_tag s_txt p_tag p_txt o_tag
|
||||
[1;31m[00mLorem [1;31mipsum[00m dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
|
||||
[1;31m[00m
|
||||
[1;31m[00mtriple: t3_txt s2_tag s_txt p_tag p_txt o_tag o_txt
|
||||
[1;31m[00m
|
||||
[1;31m[00mtriple: t4_txt s1_tag s_txt p_tag p_txt o_tag o_txt
|
||||
[1;31m[00m
|
||||
[1;31m[00mtriple: t5_txt s1_tag s_txt p_tag p_txt o_tag
|
||||
[1;31m[00mo_txt
|
||||
[1;31m[00m
|
||||
[1;31m[00mtriple: t6_txt s2_tag s_txt p_tag p_txt o_tag o_txt
|
||||
[1;31m[00m
|
||||
[1;31m[00mtriple: t7_txt s1_tag s_txt p_tag p_txt o_tag o_txt
|
||||
triple: t1_txt s1_tag s_txt p_tag p_txt o_tag o_txt
|
||||
|
||||
triple: t2_txt s1_tag s_txt p_tag p_txt o_tag
|
||||
Lorem [1;31mipsum[0m dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
|
||||
|
||||
triple: t3_txt s2_tag s_txt p_tag p_txt o_tag o_txt
|
||||
|
||||
triple: t4_txt s1_tag s_txt p_tag p_txt o_tag o_txt
|
||||
|
||||
triple: t5_txt s1_tag s_txt p_tag p_txt o_tag
|
||||
o_txt
|
||||
|
||||
triple: t6_txt s2_tag s_txt p_tag p_txt o_tag o_txt
|
||||
|
||||
triple: t7_txt s1_tag s_txt p_tag p_txt o_tag o_txt
|
||||
RC=0
|
||||
---------------------------- Test 106 -----------------------------
|
||||
a
|
||||
@ -751,3 +752,80 @@ RC=0
|
||||
2:3,1
|
||||
2:4,1
|
||||
RC=0
|
||||
---------------------------- Test 108 ------------------------------
|
||||
RC=0
|
||||
---------------------------- Test 109 -----------------------------
|
||||
RC=0
|
||||
---------------------------- Test 110 -----------------------------
|
||||
match 1:
|
||||
a
|
||||
/1/a
|
||||
match 2:
|
||||
b
|
||||
/2/b
|
||||
match 3:
|
||||
c
|
||||
/3/c
|
||||
match 4:
|
||||
d
|
||||
/4/d
|
||||
match 5:
|
||||
e
|
||||
/5/e
|
||||
RC=0
|
||||
---------------------------- Test 111 -----------------------------
|
||||
607:0,12
|
||||
609:0,12
|
||||
611:0,12
|
||||
613:0,12
|
||||
615:0,12
|
||||
RC=0
|
||||
---------------------------- Test 112 -----------------------------
|
||||
37168,12
|
||||
37180,12
|
||||
37192,12
|
||||
37204,12
|
||||
37216,12
|
||||
RC=0
|
||||
---------------------------- Test 113 -----------------------------
|
||||
476
|
||||
RC=0
|
||||
---------------------------- Test 114 -----------------------------
|
||||
testdata/grepinput:469
|
||||
testdata/grepinput3:0
|
||||
testdata/grepinput8:0
|
||||
testdata/grepinputv:1
|
||||
testdata/grepinputx:6
|
||||
TOTAL:476
|
||||
RC=0
|
||||
---------------------------- Test 115 -----------------------------
|
||||
testdata/grepinput:469
|
||||
testdata/grepinputv:1
|
||||
testdata/grepinputx:6
|
||||
TOTAL:476
|
||||
RC=0
|
||||
---------------------------- Test 116 -----------------------------
|
||||
476
|
||||
RC=0
|
||||
---------------------------- Test 117 -----------------------------
|
||||
469
|
||||
0
|
||||
0
|
||||
1
|
||||
6
|
||||
476
|
||||
RC=0
|
||||
---------------------------- Test 118 -----------------------------
|
||||
testdata/grepinput3
|
||||
testdata/grepinput8
|
||||
RC=0
|
||||
---------------------------- Test 119 -----------------------------
|
||||
123
|
||||
456
|
||||
789
|
||||
---
|
||||
abc
|
||||
def
|
||||
xyz
|
||||
---
|
||||
RC=0
|
||||
|
8
pcre2/testdata/grepoutputC
vendored
Normal file
@ -0,0 +1,8 @@
|
||||
Arg1: [T] [he ] [ ] Arg2: |T| () () (0)
|
||||
Arg1: [T] [his] [s] Arg2: |T| () () (0)
|
||||
The quick brown
|
||||
This time it jumps and jumps and jumps.
|
||||
Arg1: [qu] [qu]
|
||||
Arg1: [ t] [ t]
|
||||
The quick brown
|
||||
This time it jumps and jumps and jumps.
|
820
pcre2/testdata/testinput1
vendored
File diff suppressed because it is too large
155
pcre2/testdata/testinput10
vendored
@ -1,45 +1,7 @@
|
||||
# This set of tests is for UTF-8 support and Unicode property support, with
|
||||
# relevance only for the 8-bit library.
|
||||
|
||||
/X(\C{3})/utf
|
||||
X\x{1234}
|
||||
|
||||
/X(\C{4})/utf
|
||||
X\x{1234}YZ
|
||||
|
||||
/X\C*/utf
|
||||
XYZabcdce
|
||||
|
||||
/X\C*?/utf
|
||||
XYZabcde
|
||||
|
||||
/X\C{3,5}/utf
|
||||
Xabcdefg
|
||||
X\x{1234}
|
||||
X\x{1234}YZ
|
||||
X\x{1234}\x{512}
|
||||
X\x{1234}\x{512}YZ
|
||||
|
||||
/X\C{3,5}?/utf
|
||||
Xabcdefg
|
||||
X\x{1234}
|
||||
X\x{1234}YZ
|
||||
X\x{1234}\x{512}
|
||||
|
||||
/a\Cb/utf
|
||||
aXb
|
||||
a\nb
|
||||
|
||||
/a\C\Cb/utf
|
||||
a\x{100}b
|
||||
|
||||
/ab\Cde/utf
|
||||
abXde
|
||||
|
||||
/a\C\Cb/utf
|
||||
a\x{100}b
|
||||
** Failers
|
||||
a\x{12257}b
|
||||
# The next 4 patterns have UTF-8 errors
|
||||
|
||||
/[�]/utf
|
||||
|
||||
@ -47,7 +9,12 @@
|
||||
|
||||
/���xxx/utf
|
||||
|
||||
/��������/utf
|
||||
|
||||
# Now test subjects
|
||||
|
||||
/badutf/utf
|
||||
\= Expect UTF-8 errors
|
||||
X\xdf
|
||||
XX\xef
|
||||
XXX\xef\x80
|
||||
@ -89,11 +56,13 @@
|
||||
\xff
|
||||
|
||||
/badutf/utf
|
||||
\= Expect UTF-8 errors
|
||||
XX\xfb\x80\x80\x80\x80
|
||||
XX\xfd\x80\x80\x80\x80\x80
|
||||
XX\xf7\xbf\xbf\xbf
|
||||
|
||||
/shortutf/utf
|
||||
\= Expect UTF-8 errors
|
||||
XX\xdf\=ph
|
||||
XX\xef\=ph
|
||||
XX\xef\x80\=ph
|
||||
@ -111,6 +80,7 @@
|
||||
\xfd\x80\x80\x80\x80\=ph
|
||||
|
||||
/anything/utf
|
||||
\= Expect UTF-8 errors
|
||||
X\xc0\x80
|
||||
XX\xc1\x8f
|
||||
XXX\xe0\x9f\x80
|
||||
@ -119,20 +89,57 @@
|
||||
\xfc\x83\x80\x80\x80\x80
|
||||
\xfe\x80\x80\x80\x80\x80
|
||||
\xff\x80\x80\x80\x80\x80
|
||||
\xf8\x88\x80\x80\x80
|
||||
\xf9\x87\x80\x80\x80
|
||||
\xfc\x84\x80\x80\x80\x80
|
||||
\xfd\x83\x80\x80\x80\x80
|
||||
\= Expect no match
|
||||
\xc3\x8f
|
||||
\xe0\xaf\x80
|
||||
\xe1\x80\x80
|
||||
\xf0\x9f\x80\x80
|
||||
\xf1\x8f\x80\x80
|
||||
\xf8\x88\x80\x80\x80
|
||||
\xf9\x87\x80\x80\x80
|
||||
\xfc\x84\x80\x80\x80\x80
|
||||
\xfd\x83\x80\x80\x80\x80
|
||||
\xf8\x88\x80\x80\x80\=no_utf_check
|
||||
\xf9\x87\x80\x80\x80\=no_utf_check
|
||||
\xfc\x84\x80\x80\x80\x80\=no_utf_check
|
||||
\xfd\x83\x80\x80\x80\x80\=no_utf_check
|
||||
|
||||
# Similar tests with offsets
|
||||
|
||||
/badutf/utf
|
||||
\= Expect UTF-8 errors
|
||||
X\xdfabcd
|
||||
X\xdfabcd\=offset=1
|
||||
\= Expect no match
|
||||
X\xdfabcd\=offset=2
|
||||
|
||||
/(?<=x)badutf/utf
|
||||
\= Expect UTF-8 errors
|
||||
X\xdfabcd
|
||||
X\xdfabcd\=offset=1
|
||||
X\xdfabcd\=offset=2
|
||||
X\xdfabcd\xdf\=offset=3
|
||||
\= Expect no match
|
||||
X\xdfabcd\=offset=3
|
||||
|
||||
/(?<=xx)badutf/utf
|
||||
\= Expect UTF-8 errors
|
||||
X\xdfabcd
|
||||
X\xdfabcd\=offset=1
|
||||
X\xdfabcd\=offset=2
|
||||
X\xdfabcd\=offset=3
|
||||
|
||||
/(?<=xxxx)badutf/utf
|
||||
\= Expect UTF-8 errors
|
||||
X\xdfabcd
|
||||
X\xdfabcd\=offset=1
|
||||
X\xdfabcd\=offset=2
|
||||
X\xdfabcd\=offset=3
|
||||
X\xdfabc\xdf\=offset=6
|
||||
X\xdfabc\xdf\=offset=7
|
||||
\= Expect no match
|
||||
X\xdfabcd\=offset=6
|
||||
|
||||
/\x{100}/IB,utf
|
||||
|
||||
/\x{1000}/IB,utf
|
||||
@ -167,27 +174,12 @@
|
||||
|
||||
/\x{212ab}/IB,utf
|
||||
|
||||
# This one is here not because it's different to Perl, but because the way
|
||||
# the captured single-byte is displayed. (In Perl it becomes a character, and you
|
||||
# can't tell the difference.)
|
||||
|
||||
/X(\C)(.*)/utf
|
||||
X\x{1234}
|
||||
X\nabc
|
||||
|
||||
# This one is here because Perl gives out a grumbly error message (quite
|
||||
# correctly, but that messes up comparisons).
|
||||
|
||||
/a\Cb/utf
|
||||
*** Failers
|
||||
a\x{100}b
|
||||
|
||||
/[^ab\xC0-\xF0]/IB,utf
|
||||
\x{f1}
|
||||
\x{bf}
|
||||
\x{100}
|
||||
\x{1000}
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
\x{c0}
|
||||
\x{f0}
|
||||
|
||||
@ -214,7 +206,6 @@
|
||||
\x{100}
|
||||
Z\x{100}
|
||||
\x{100}Z
|
||||
*** Failers
|
||||
|
||||
/[\xff]/IB,utf
|
||||
>\x{ff}<
|
||||
@ -236,21 +227,23 @@
|
||||
# This tests the stricter UTF-8 check according to RFC 3629.
|
||||
|
||||
/X/utf
|
||||
\= Expect UTF-8 errors
|
||||
\x{d800}
|
||||
\x{d800}\=no_utf_check
|
||||
\x{da00}
|
||||
\x{da00}\=no_utf_check
|
||||
\x{dfff}
|
||||
\x{dfff}\=no_utf_check
|
||||
\x{110000}
|
||||
\x{110000}\=no_utf_check
|
||||
\x{2000000}
|
||||
\x{2000000}\=no_utf_check
|
||||
\x{7fffffff}
|
||||
\= Expect no match
|
||||
\x{d800}\=no_utf_check
|
||||
\x{da00}\=no_utf_check
|
||||
\x{dfff}\=no_utf_check
|
||||
\x{110000}\=no_utf_check
|
||||
\x{2000000}\=no_utf_check
|
||||
\x{7fffffff}\=no_utf_check
|
||||
|
||||
/(*UTF8)\x{1234}/
|
||||
abcd\x{1234}pqr
|
||||
abcd\x{1234}pqr
|
||||
|
||||
/(*CRLF)(*UTF)(*BSR_UNICODE)a\Rb/I
|
||||
|
||||
@ -290,11 +283,14 @@
|
||||
|
||||
/a+/utf
|
||||
a\x{123}aa\=offset=1
|
||||
a\x{123}aa\=offset=2
|
||||
a\x{123}aa\=offset=3
|
||||
a\x{123}aa\=offset=4
|
||||
a\x{123}aa\=offset=5
|
||||
\= Expect bad offset value
|
||||
a\x{123}aa\=offset=6
|
||||
\= Expect bad UTF-8 offset
|
||||
a\x{123}aa\=offset=2
|
||||
\= Expect no match
|
||||
a\x{123}aa\=offset=5
|
||||
|
||||
/\x{1234}+/Ii,utf
|
||||
|
||||
@ -395,7 +391,6 @@
|
||||
Z\x{100}
|
||||
\x{100}
|
||||
\x{100}Z
|
||||
*** Failers
|
||||
|
||||
/[z-\x{100}]/IB,utf
|
||||
|
||||
@ -421,7 +416,7 @@
|
||||
\x{104}
|
||||
\x{105}
|
||||
\x{109}
|
||||
** Failers
|
||||
\= Expect no match
|
||||
\x{100}
|
||||
\x{10a}
|
||||
|
||||
@ -435,7 +430,7 @@
|
||||
\x{ff}
|
||||
\x{100}
|
||||
\x{101}
|
||||
** Failers
|
||||
\= Expect no match
|
||||
\x{102}
|
||||
Y
|
||||
y
|
||||
@ -445,6 +440,22 @@
|
||||
/\x{3a3}B/IBi,utf
|
||||
|
||||
/abc/utf,replace=�
|
||||
abc
|
||||
abc
|
||||
|
||||
/(?<=(a)(?-1))x/I,utf
|
||||
a\x80zx\=offset=3
|
||||
|
||||
/[\W\p{Any}]/B
|
||||
abc
|
||||
123
|
||||
|
||||
/[\W\pL]/B
|
||||
abc
|
||||
\= Expect no match
|
||||
123
|
||||
|
||||
/(*:*++++++++++++''''''''''''''''''''+''+++'+++x+++++++++++++++++++++++++++++++++++(++++++++++++++++++++:++++++%++:''''''''''''''''''''''''+++++++++++++++++++++++++++++++++++++++++++++++++++++-++++++++k+++++++''''+++'+++++++++++++++++++++++''''++++++++++++':ƿ)/utf
|
||||
|
||||
/[\s[:^ascii:]]/B,ucp
|
||||
|
||||
# End of testinput10
|
||||
|
24
pcre2/testdata/testinput11
vendored
@ -4,11 +4,8 @@
|
||||
# different, so they have separate output files.
|
||||
|
||||
#forbid_utf
|
||||
#newline_default LF ANY ANYCRLF
|
||||
|
||||
/a\Cb/
|
||||
aXb
|
||||
a\nb
|
||||
|
||||
/[^\x{c4}]/IB
|
||||
|
||||
/\x{100}/I
|
||||
@ -343,7 +340,7 @@
|
||||
|
||||
# Non-UTF characters
|
||||
|
||||
/\C{2,3}/
|
||||
/.{2,3}/
|
||||
\x{400000}\x{400001}\x{400002}\x{400003}
|
||||
|
||||
/\x{400000}\x{800000}/IBi
|
||||
@ -354,4 +351,21 @@
|
||||
|
||||
/[\V]/IB
|
||||
|
||||
/(*THEN:\[A]{65501})/expand
|
||||
|
||||
# We can use pcre2test's utf8_input modifier to create wide pattern characters,
|
||||
# even though this test is run when UTF is not supported.
|
||||
|
||||
/ab������z/utf8_input
|
||||
ab������z
|
||||
ab\x{7fffffff}z
|
||||
|
||||
/ab�������z/utf8_input
|
||||
ab�������z
|
||||
ab\x{ffffffff}z
|
||||
|
||||
/ab�Az/utf8_input
|
||||
ab�Az
|
||||
ab\x{80000041}z
|
||||
|
||||
# End of testinput11
|
||||
|
112
pcre2/testdata/testinput12
vendored
@ -7,49 +7,6 @@
|
||||
/abc/utf
|
||||
�]
|
||||
|
||||
/X(\C{3})/utf
|
||||
X\x{11234}Y
|
||||
X\x{11234}YZ
|
||||
|
||||
/X(\C{4})/utf
|
||||
X\x{11234}YZ
|
||||
X\x{11234}YZW
|
||||
|
||||
/X\C*/utf
|
||||
XYZabcdce
|
||||
|
||||
/X\C*?/utf
|
||||
XYZabcde
|
||||
|
||||
/X\C{3,5}/utf
|
||||
Xabcdefg
|
||||
X\x{11234}Y
|
||||
X\x{11234}YZ
|
||||
X\x{11234}\x{512}
|
||||
X\x{11234}\x{512}YZ
|
||||
X\x{11234}\x{512}\x{11234}Z
|
||||
|
||||
/X\C{3,5}?/utf
|
||||
Xabcdefg
|
||||
X\x{11234}Y
|
||||
X\x{11234}YZ
|
||||
X\x{11234}\x{512}YZ
|
||||
*** Failers
|
||||
X\x{11234}
|
||||
|
||||
/a\Cb/utf
|
||||
aXb
|
||||
a\nb
|
||||
|
||||
/a\C\Cb/utf
|
||||
a\x{12257}b
|
||||
a\x{12257}\x{11234}b
|
||||
** Failers
|
||||
a\x{100}b
|
||||
|
||||
/ab\Cde/utf
|
||||
abXde
|
||||
|
||||
# Check maximum character size
|
||||
|
||||
/\x{ffff}/IB,utf
|
||||
@ -90,27 +47,12 @@
|
||||
|
||||
/\x{212ab}/IB,utf
|
||||
|
||||
# This one is here not because it's different to Perl, but because the way
|
||||
# the captured single-byte is displayed. (In Perl it becomes a character, and you
|
||||
# can't tell the difference.)
|
||||
|
||||
/X(\C)(.*)/utf
|
||||
X\x{1234}
|
||||
X\nabc
|
||||
|
||||
# This one is here because Perl gives out a grumbly error message (quite
|
||||
# correctly, but that messes up comparisons).
|
||||
|
||||
/a\Cb/utf
|
||||
*** Failers
|
||||
a\x{100}b
|
||||
|
||||
/[^ab\xC0-\xF0]/IB,utf
|
||||
\x{f1}
|
||||
\x{bf}
|
||||
\x{100}
|
||||
\x{1000}
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
\x{c0}
|
||||
\x{f0}
|
||||
|
||||
@ -137,7 +79,6 @@
|
||||
\x{100}
|
||||
Z\x{100}
|
||||
\x{100}Z
|
||||
*** Failers
|
||||
|
||||
/[\xff]/IB,utf
|
||||
>\x{ff}<
|
||||
@ -157,18 +98,24 @@
|
||||
/^[\QĀ\E-\QŐ\E/B,utf
|
||||
|
||||
/X/utf
|
||||
XX\x{d800}
|
||||
XX\x{d800}\=no_utf_check
|
||||
XX\x{da00}
|
||||
XX\x{da00}\=no_utf_check
|
||||
XX\x{dc00}
|
||||
XX\x{dc00}\=no_utf_check
|
||||
XX\x{de00}
|
||||
XX\x{de00}\=no_utf_check
|
||||
XX\x{dfff}
|
||||
XX\x{dfff}\=no_utf_check
|
||||
\= Expect UTF error
|
||||
XX\x{d800}
|
||||
XX\x{da00}
|
||||
XX\x{dc00}
|
||||
XX\x{de00}
|
||||
XX\x{dfff}
|
||||
XX\x{110000}
|
||||
XX\x{d800}\x{1234}
|
||||
\= Expect no match
|
||||
XX\x{d800}\=offset=3
|
||||
|
||||
/(?<=.)X/utf
|
||||
XX\x{d800}\=offset=3
|
||||
|
||||
/(*UTF16)\x{11234}/
|
||||
abcd\x{11234}pqr
|
||||
@ -229,7 +176,9 @@
|
||||
a\x{123}aa\=offset=1
|
||||
a\x{123}aa\=offset=2
|
||||
a\x{123}aa\=offset=3
|
||||
\= Expect no match
|
||||
a\x{123}aa\=offset=4
|
||||
\= Expect bad offset error
|
||||
a\x{123}aa\=offset=5
|
||||
a\x{123}aa\=offset=6
|
||||
|
||||
@ -250,11 +199,16 @@
|
||||
# Check bad offset
|
||||
|
||||
/a/utf
|
||||
\= Expect bad UTF-16 offset, or no match in 32-bit
|
||||
\x{10000}\=offset=1
|
||||
\x{10000}ab\=offset=1
|
||||
\= Expect 16-bit match, 32-bit no match
|
||||
\x{10000}ab\=offset=2
|
||||
\= Expect no match
|
||||
\x{10000}ab\=offset=3
|
||||
\= Expect no match in 16-bit, bad offset in 32-bit
|
||||
\x{10000}ab\=offset=4
|
||||
\= Expect bad offset
|
||||
\x{10000}ab\=offset=5
|
||||
|
||||
/���/utf
|
||||
@ -329,9 +283,6 @@
|
||||
|
||||
/\o{4200000}/utf
|
||||
|
||||
/\C/utf
|
||||
\x{110000}
|
||||
|
||||
/\x{100}*A/IB,utf
|
||||
A
|
||||
|
||||
@ -341,7 +292,6 @@
|
||||
Z\x{100}
|
||||
\x{100}
|
||||
\x{100}Z
|
||||
*** Failers
|
||||
|
||||
/[z-\x{100}]/IB,utf
|
||||
|
||||
@ -367,7 +317,7 @@
|
||||
\x{104}
|
||||
\x{105}
|
||||
\x{109}
|
||||
** Failers
|
||||
\= Expect no match
|
||||
\x{100}
|
||||
\x{10a}
|
||||
|
||||
@ -381,7 +331,7 @@
|
||||
\x{ff}
|
||||
\x{100}
|
||||
\x{101}
|
||||
** Failers
|
||||
\= Expect no match
|
||||
\x{102}
|
||||
Y
|
||||
y
|
||||
@ -390,4 +340,24 @@
|
||||
|
||||
/\x{3a3}B/IBi,utf
|
||||
|
||||
/./utf
|
||||
\x{110000}
|
||||
|
||||
/(*UTF)ab������z/B
|
||||
|
||||
/ab������z/utf
|
||||
|
||||
/[\W\p{Any}]/B
|
||||
abc
|
||||
123
|
||||
|
||||
/[\W\pL]/B
|
||||
abc
|
||||
\x{100}
|
||||
\x{308}
|
||||
\= Expect no match
|
||||
123
|
||||
|
||||
/[\s[:^ascii:]]/B,ucp
|
||||
|
||||
# End of testinput12
|
||||
|
139
pcre2/testdata/testinput14
vendored
@ -1,112 +1,37 @@
|
||||
# These are:
|
||||
#
|
||||
# (1) Tests of the match-limiting features. The results are different for
|
||||
# interpretive or JIT matching, so this test should not be run with JIT. The
|
||||
# same tests are run using JIT in test 16.
|
||||
# These test special (mostly error) UTF features of DFA matching. They are a
|
||||
# selection of the more comprehensive tests that are run for non-DFA matching.
|
||||
# The output is different for the different widths.
|
||||
|
||||
# (2) Other tests that must not be run with JIT.
|
||||
#subject dfa
|
||||
|
||||
/(a+)*zz/I
|
||||
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzbbbbbb\=find_limits
|
||||
aaaaaaaaaaaaaz\=find_limits
|
||||
/X/utf
|
||||
XX\x{d800}
|
||||
XX\x{d800}\=offset=3
|
||||
XX\x{d800}\=no_utf_check
|
||||
XX\x{da00}
|
||||
XX\x{da00}\=no_utf_check
|
||||
XX\x{dc00}
|
||||
XX\x{dc00}\=no_utf_check
|
||||
XX\x{de00}
|
||||
XX\x{de00}\=no_utf_check
|
||||
XX\x{dfff}
|
||||
XX\x{dfff}\=no_utf_check
|
||||
XX\x{110000}
|
||||
XX\x{d800}\x{1234}
|
||||
|
||||
/badutf/utf
|
||||
X\xdf
|
||||
XX\xef
|
||||
XXX\xef\x80
|
||||
X\xf7
|
||||
XX\xf7\x80
|
||||
XXX\xf7\x80\x80
|
||||
|
||||
!((?:\s|//.*\\n|/[*](?:\\n|.)*?[*]/)*)!I
|
||||
/* this is a C style comment */\=find_limits
|
||||
|
||||
/^(?>a)++/
|
||||
aa\=find_limits
|
||||
aaaaaaaaa\=find_limits
|
||||
|
||||
/(a)(?1)++/
|
||||
aa\=find_limits
|
||||
aaaaaaaaa\=find_limits
|
||||
|
||||
/a(?:.)*?a/ims
|
||||
abbbbbbbbbbbbbbbbbbbbba\=find_limits
|
||||
|
||||
/a(?:.(*THEN))*?a/ims
|
||||
abbbbbbbbbbbbbbbbbbbbba\=find_limits
|
||||
|
||||
/a(?:.(*THEN:ABC))*?a/ims
|
||||
abbbbbbbbbbbbbbbbbbbbba\=find_limits
|
||||
|
||||
/^(?>a+)(?>b+)(?>c+)(?>d+)(?>e+)/
|
||||
aabbccddee\=find_limits
|
||||
|
||||
/^(?>(a+))(?>(b+))(?>(c+))(?>(d+))(?>(e+))/
|
||||
aabbccddee\=find_limits
|
||||
|
||||
/^(?>(a+))(?>b+)(?>(c+))(?>d+)(?>(e+))/
|
||||
aabbccddee\=find_limits
|
||||
|
||||
/(*LIMIT_MATCH=12bc)abc/
|
||||
|
||||
/(*LIMIT_MATCH=4294967290)abc/
|
||||
|
||||
/(*LIMIT_RECURSION=4294967280)abc/I
|
||||
|
||||
/(a+)*zz/
|
||||
aaaaaaaaaaaaaz
|
||||
aaaaaaaaaaaaaz\=match_limit=3000
|
||||
|
||||
/(a+)*zz/
|
||||
aaaaaaaaaaaaaz\=recursion_limit=10
|
||||
|
||||
/(*LIMIT_MATCH=3000)(a+)*zz/I
|
||||
aaaaaaaaaaaaaz
|
||||
aaaaaaaaaaaaaz\=match_limit=60000
|
||||
|
||||
/(*LIMIT_MATCH=60000)(*LIMIT_MATCH=3000)(a+)*zz/I
|
||||
aaaaaaaaaaaaaz
|
||||
|
||||
/(*LIMIT_MATCH=60000)(a+)*zz/I
|
||||
aaaaaaaaaaaaaz
|
||||
aaaaaaaaaaaaaz\=match_limit=3000
|
||||
|
||||
/(*LIMIT_RECURSION=10)(a+)*zz/I
|
||||
aaaaaaaaaaaaaz
|
||||
aaaaaaaaaaaaaz\=recursion_limit=1000
|
||||
|
||||
/(*LIMIT_RECURSION=10)(*LIMIT_RECURSION=1000)(a+)*zz/I
|
||||
aaaaaaaaaaaaaz
|
||||
|
||||
/(*LIMIT_RECURSION=1000)(a+)*zz/I
|
||||
aaaaaaaaaaaaaz
|
||||
aaaaaaaaaaaaaz\=recursion_limit=10
|
||||
|
||||
# These three have infinitely nested recursions.
|
||||
|
||||
/((?2))((?1))/
|
||||
abc
|
||||
|
||||
/((?(R2)a+|(?1)b))/
|
||||
aaaabcde
|
||||
|
||||
/(?(R)a*(?1)|((?R))b)/
|
||||
aaaabcde
|
||||
|
||||
# The allusedtext modifier does not work with JIT, which does not maintain
|
||||
# the leftchar/rightchar data.
|
||||
|
||||
/abc(?=xyz)/allusedtext
|
||||
abcxyzpqr
|
||||
abcxyzpqr\=aftertext
|
||||
|
||||
/(?<=pqr)abc(?=xyz)/allusedtext
|
||||
xyzpqrabcxyzpqr
|
||||
xyzpqrabcxyzpqr\=aftertext
|
||||
|
||||
/a\b/
|
||||
a.\=allusedtext
|
||||
a\=allusedtext
|
||||
|
||||
/abc\Kxyz/
|
||||
abcxyz\=allusedtext
|
||||
|
||||
/abc(?=xyz(*ACCEPT))/
|
||||
abcxyz\=allusedtext
|
||||
|
||||
/abc(?=abcde)(?=ab)/allusedtext
|
||||
abcabcdefg
|
||||
/shortutf/utf
|
||||
XX\xdf\=ph
|
||||
XX\xef\=ph
|
||||
XX\xef\x80\=ph
|
||||
\xf7\=ph
|
||||
\xf7\x80\=ph
|
||||
|
||||
# End of testinput14
|
||||
|
169
pcre2/testdata/testinput15
vendored
@ -1,9 +1,168 @@
|
||||
# This test is run only when JIT support is not available. It checks that an
|
||||
# attempt to use it has the expected behaviour. It also tests things that
|
||||
# are different without JIT.
|
||||
# These are:
|
||||
#
|
||||
# (1) Tests of the match-limiting features. The results are different for
|
||||
# interpretive or JIT matching, so this test should not be run with JIT. The
|
||||
# same tests are run using JIT in test 17.
|
||||
|
||||
/abc/I,jit,jitverify
|
||||
# (2) Other tests that must not be run with JIT.
|
||||
|
||||
/a*/I
|
||||
/(a+)*zz/I
|
||||
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzbbbbbb\=find_limits
|
||||
aaaaaaaaaaaaaz\=find_limits
|
||||
|
||||
!((?:\s|//.*\\n|/[*](?:\\n|.)*?[*]/)*)!I
|
||||
/* this is a C style comment */\=find_limits
|
||||
|
||||
/^(?>a)++/
|
||||
aa\=find_limits
|
||||
aaaaaaaaa\=find_limits
|
||||
|
||||
/(a)(?1)++/
|
||||
aa\=find_limits
|
||||
aaaaaaaaa\=find_limits
|
||||
|
||||
/a(?:.)*?a/ims
|
||||
abbbbbbbbbbbbbbbbbbbbba\=find_limits
|
||||
|
||||
/a(?:.(*THEN))*?a/ims
|
||||
abbbbbbbbbbbbbbbbbbbbba\=find_limits
|
||||
|
||||
/a(?:.(*THEN:ABC))*?a/ims
|
||||
abbbbbbbbbbbbbbbbbbbbba\=find_limits
|
||||
|
||||
/^(?>a+)(?>b+)(?>c+)(?>d+)(?>e+)/
|
||||
aabbccddee\=find_limits
|
||||
|
||||
/^(?>(a+))(?>(b+))(?>(c+))(?>(d+))(?>(e+))/
|
||||
aabbccddee\=find_limits
|
||||
|
||||
/^(?>(a+))(?>b+)(?>(c+))(?>d+)(?>(e+))/
|
||||
aabbccddee\=find_limits
|
||||
|
||||
/(*LIMIT_MATCH=12bc)abc/
|
||||
|
||||
/(*LIMIT_MATCH=4294967290)abc/
|
||||
|
||||
/(*LIMIT_RECURSION=4294967280)abc/I
|
||||
|
||||
/(a+)*zz/
|
||||
aaaaaaaaaaaaaz
|
||||
aaaaaaaaaaaaaz\=match_limit=3000
|
||||
|
||||
/(a+)*zz/
|
||||
aaaaaaaaaaaaaz\=recursion_limit=10
|
||||
|
||||
/(*LIMIT_MATCH=3000)(a+)*zz/I
|
||||
aaaaaaaaaaaaaz
|
||||
aaaaaaaaaaaaaz\=match_limit=60000
|
||||
|
||||
/(*LIMIT_MATCH=60000)(*LIMIT_MATCH=3000)(a+)*zz/I
|
||||
aaaaaaaaaaaaaz
|
||||
|
||||
/(*LIMIT_MATCH=60000)(a+)*zz/I
|
||||
aaaaaaaaaaaaaz
|
||||
aaaaaaaaaaaaaz\=match_limit=3000
|
||||
|
||||
/(*LIMIT_RECURSION=10)(a+)*zz/I
|
||||
aaaaaaaaaaaaaz
|
||||
aaaaaaaaaaaaaz\=recursion_limit=1000
|
||||
|
||||
/(*LIMIT_RECURSION=10)(*LIMIT_RECURSION=1000)(a+)*zz/I
|
||||
aaaaaaaaaaaaaz
|
||||
|
||||
/(*LIMIT_RECURSION=1000)(a+)*zz/I
|
||||
aaaaaaaaaaaaaz
|
||||
aaaaaaaaaaaaaz\=recursion_limit=10
|
||||
|
||||
# These three have infinitely nested recursions.
|
||||
|
||||
/((?2))((?1))/
|
||||
abc
|
||||
|
||||
/((?(R2)a+|(?1)b))()/
|
||||
aaaabcde
|
||||
|
||||
/(?(R)a*(?1)|((?R))b)/
|
||||
aaaabcde
|
||||
|
||||
# The allusedtext modifier does not work with JIT, which does not maintain
|
||||
# the leftchar/rightchar data.
|
||||
|
||||
/abc(?=xyz)/allusedtext
|
||||
abcxyzpqr
|
||||
abcxyzpqr\=aftertext
|
||||
|
||||
/(?<=pqr)abc(?=xyz)/allusedtext
|
||||
xyzpqrabcxyzpqr
|
||||
xyzpqrabcxyzpqr\=aftertext
|
||||
|
||||
/a\b/
|
||||
a.\=allusedtext
|
||||
a\=allusedtext
|
||||
|
||||
/abc\Kxyz/
|
||||
abcxyz\=allusedtext
|
||||
|
||||
/abc(?=xyz(*ACCEPT))/
|
||||
abcxyz\=allusedtext
|
||||
|
||||
/abc(?=abcde)(?=ab)/allusedtext
|
||||
abcabcdefg
|
||||
|
||||
# These tests provoke recursion loops, which give a different error message
|
||||
# when JIT is used.
|
||||
|
||||
/(?R)/I
|
||||
abcd
|
||||
|
||||
/(a|(?R))/I
|
||||
abcd
|
||||
defg
|
||||
|
||||
/(ab|(bc|(de|(?R))))/I
|
||||
abcd
|
||||
fghi
|
||||
|
||||
/(ab|(bc|(de|(?1))))/I
|
||||
abcd
|
||||
fghi
|
||||
|
||||
/x(ab|(bc|(de|(?1)x)x)x)/I
|
||||
xab123
|
||||
xfghi
|
||||
|
||||
/(?!\w)(?R)/
|
||||
abcd
|
||||
=abc
|
||||
|
||||
/(?=\w)(?R)/
|
||||
=abc
|
||||
abcd
|
||||
|
||||
/(?<!\w)(?R)/
|
||||
abcd
|
||||
|
||||
/(?<=\w)(?R)/
|
||||
abcd
|
||||
|
||||
/(a+|(?R)b)/
|
||||
aaa
|
||||
bbb
|
||||
|
||||
/[^\xff]((?1))/BI
|
||||
abcd
|
||||
|
||||
# These tests don't behave the same with JIT
|
||||
|
||||
/\w+(?C1)/BI,no_auto_possess
|
||||
abc\=callout_fail=1
|
||||
|
||||
/(*NO_AUTO_POSSESS)\w+(?C1)/BI
|
||||
abc\=callout_fail=1
|
||||
|
||||
# This test breaks the JIT stack limit
|
||||
|
||||
/(|]+){2,2452}/
|
||||
(|]+){2,2452}
|
||||
|
||||
# End of testinput15
|
||||
|
206
pcre2/testdata/testinput16
vendored
File diff suppressed because one or more lines are too long
355
pcre2/testdata/testinput17
vendored
File diff suppressed because one or more lines are too long
117
pcre2/testdata/testinput18
vendored
@ -1,17 +1,112 @@
|
||||
# This set of tests is run only with the 8-bit library. It tests the POSIX
|
||||
# interface with UTF/UCP support, which is supported only with the 8-bit
|
||||
# library. This test should not be run with JIT (which is not available for the
|
||||
# POSIX interface).
|
||||
# interface, which is supported only with the 8-bit library. This test should
|
||||
# not be run with JIT (which is not available for the POSIX interface).
|
||||
|
||||
#forbid_utf
|
||||
#pattern posix
|
||||
|
||||
/a\x{1234}b/utf
|
||||
a\x{1234}b
|
||||
# Test invalid options
|
||||
|
||||
/\w/
|
||||
+++\x{c2}
|
||||
/abc/auto_callout
|
||||
|
||||
/\w/ucp
|
||||
+++\x{c2}
|
||||
|
||||
# End of testdata/testinput17
|
||||
/abc/
|
||||
abc\=find_limits
|
||||
|
||||
/abc/
|
||||
abc\=partial_hard
|
||||
|
||||
# Real tests
|
||||
|
||||
/abc/
|
||||
abc
|
||||
|
||||
/^abc|def/
|
||||
abcdef
|
||||
abcdef\=notbol
|
||||
|
||||
/.*((abc)$|(def))/
|
||||
defabc
|
||||
defabc\=noteol
|
||||
|
||||
/the quick brown fox/
|
||||
the quick brown fox
|
||||
\= Expect no match
|
||||
The Quick Brown Fox
|
||||
|
||||
/the quick brown fox/i
|
||||
the quick brown fox
|
||||
The Quick Brown Fox
|
||||
|
||||
/(*LF)abc.def/
|
||||
\= Expect no match
|
||||
abc\ndef
|
||||
|
||||
/(*LF)abc$/
|
||||
abc
|
||||
abc\n
|
||||
|
||||
/(abc)\2/
|
||||
|
||||
/(abc\1)/
|
||||
\= Expect no match
|
||||
abc
|
||||
|
||||
/a*(b+)(z)(z)/
|
||||
aaaabbbbzzzz
|
||||
aaaabbbbzzzz\=ovector=0
|
||||
aaaabbbbzzzz\=ovector=1
|
||||
aaaabbbbzzzz\=ovector=2
|
||||
|
||||
/(*ANY)ab.cd/
|
||||
ab-cd
|
||||
ab=cd
|
||||
\= Expect no match
|
||||
ab\ncd
|
||||
|
||||
/ab.cd/s
|
||||
ab-cd
|
||||
ab=cd
|
||||
ab\ncd
|
||||
|
||||
/a(b)c/posix_nosub
|
||||
abc
|
||||
|
||||
/a(?P<name>b)c/posix_nosub
|
||||
abc
|
||||
|
||||
/(a)\1/posix_nosub
|
||||
zaay
|
||||
|
||||
/a?|b?/
|
||||
abc
|
||||
\= Expect no match
|
||||
ddd\=notempty
|
||||
|
||||
/\w+A/
|
||||
CDAAAAB
|
||||
|
||||
/\w+A/ungreedy
|
||||
CDAAAAB
|
||||
|
||||
/\Biss\B/I,aftertext
|
||||
Mississippi
|
||||
|
||||
/abc/\
|
||||
|
||||
"(?(?C)"
|
||||
|
||||
"(?(?C))"
|
||||
|
||||
/abcd/substitute_extended
|
||||
|
||||
/\[A]{1000000}**/expand,regerror_buffsize=31
|
||||
|
||||
/\[A]{1000000}**/expand,regerror_buffsize=32
|
||||
|
||||
//posix_nosub
|
||||
\=offset=70000
|
||||
|
||||
/(?=(a\K))/
|
||||
a
|
||||
|
||||
# End of testdata/testinput18
|
||||
|
76
pcre2/testdata/testinput19
vendored
@ -1,62 +1,18 @@
|
||||
# This set of tests exercises the serialization/deserialization functions in
|
||||
# the library. It does not use UTF or JIT.
|
||||
|
||||
#forbid_utf
|
||||
|
||||
# Compile several patterns, push them onto the stack, and then write them
|
||||
# all to a file.
|
||||
|
||||
#pattern push
|
||||
|
||||
/(?<NAME>(?&NAME_PAT))\s+(?<ADDR>(?&ADDRESS_PAT))
|
||||
(?(DEFINE)
|
||||
(?<NAME_PAT>[a-z]+)
|
||||
(?<ADDRESS_PAT>\d+)
|
||||
)/x
|
||||
/^(?:((.)(?1)\2|)|((.)(?3)\4|.))$/i
|
||||
|
||||
#save testsaved1
|
||||
|
||||
# Do it again for some more patterns.
|
||||
|
||||
/(*MARK:A)(*SKIP:B)(C|X)/mark
|
||||
/(?:(?<n>foo)|(?<n>bar))\k<n>/dupnames
|
||||
|
||||
#save testsaved2
|
||||
#pattern -push
|
||||
|
||||
# Reload the patterns, then pop them one by one and check them.
|
||||
|
||||
#load testsaved1
|
||||
#load testsaved2
|
||||
|
||||
#pop info
|
||||
foofoo
|
||||
barbar
|
||||
# This set of tests is run only with the 8-bit library. It tests the POSIX
|
||||
# interface with UTF/UCP support, which is supported only with the 8-bit
|
||||
# library. This test should not be run with JIT (which is not available for the
|
||||
# POSIX interface).
|
||||
|
||||
#pop mark
|
||||
C
|
||||
D
|
||||
#pattern posix
|
||||
|
||||
/a\x{1234}b/utf
|
||||
a\x{1234}b
|
||||
|
||||
/\w/
|
||||
\= Expect no match
|
||||
+++\x{c2}
|
||||
|
||||
/\w/ucp
|
||||
+++\x{c2}
|
||||
|
||||
#pop
|
||||
AmanaplanacanalPanama
|
||||
|
||||
#pop info
|
||||
metcalfe 33
|
||||
|
||||
# Check for an error when different tables are used.
|
||||
|
||||
/abc/push,tables=1
|
||||
/xyz/push,tables=2
|
||||
#save testsaved1
|
||||
|
||||
#pop
|
||||
xyz
|
||||
|
||||
#pop
|
||||
abc
|
||||
|
||||
#pop should give an error
|
||||
pqr
|
||||
|
||||
# End of testinput19
|
||||
# End of testdata/testinput19
|
||||
|
1050
pcre2/testdata/testinput2
vendored
File diff suppressed because it is too large
100
pcre2/testdata/testinput20
vendored
Normal file
@ -0,0 +1,100 @@
|
||||
# This set of tests exercises the serialization/deserialization and code copy
|
||||
# functions in the library. It does not use UTF or JIT.
|
||||
|
||||
#forbid_utf
|
||||
|
||||
# Compile several patterns, push them onto the stack, and then write them
|
||||
# all to a file.
|
||||
|
||||
#pattern push
|
||||
|
||||
/(?<NAME>(?&NAME_PAT))\s+(?<ADDR>(?&ADDRESS_PAT))
|
||||
(?(DEFINE)
|
||||
(?<NAME_PAT>[a-z]+)
|
||||
(?<ADDRESS_PAT>\d+)
|
||||
)/x
|
||||
/^(?:((.)(?1)\2|)|((.)(?3)\4|.))$/i
|
||||
|
||||
#save testsaved1
|
||||
|
||||
# Do it again for some more patterns.
|
||||
|
||||
/(*MARK:A)(*SKIP:B)(C|X)/mark
|
||||
/(?:(?<n>foo)|(?<n>bar))\k<n>/dupnames
|
||||
|
||||
#save testsaved2
|
||||
#pattern -push
|
||||
|
||||
# Reload the patterns, then pop them one by one and check them.
|
||||
|
||||
#load testsaved1
|
||||
#load testsaved2
|
||||
|
||||
#pop info
|
||||
foofoo
|
||||
barbar
|
||||
|
||||
#pop mark
|
||||
C
|
||||
\= Expect no match
|
||||
D
|
||||
|
||||
#pop
|
||||
AmanaplanacanalPanama
|
||||
|
||||
#pop info
|
||||
metcalfe 33
|
||||
|
||||
# Check for an error when different tables are used.
|
||||
|
||||
/abc/push,tables=1
|
||||
/xyz/push,tables=2
|
||||
#save testsaved1
|
||||
|
||||
#pop
|
||||
xyz
|
||||
|
||||
#pop
|
||||
abc
|
||||
|
||||
#pop should give an error
|
||||
pqr
|
||||
|
||||
/abcd/pushcopy
|
||||
abcd
|
||||
|
||||
#pop
|
||||
abcd
|
||||
|
||||
#pop should give an error
|
||||
|
||||
/abcd/push
|
||||
#popcopy
|
||||
abcd
|
||||
|
||||
#pop
|
||||
abcd
|
||||
|
||||
/abcd/push
|
||||
#save testsaved1
|
||||
#pop should give an error
|
||||
|
||||
#load testsaved1
|
||||
#popcopy
|
||||
abcd
|
||||
|
||||
#pop
|
||||
abcd
|
||||
|
||||
#pop should give an error
|
||||
|
||||
/abcd/pushtablescopy
|
||||
abcd
|
||||
|
||||
#popcopy
|
||||
abcd
|
||||
|
||||
#pop
|
||||
abcd
|
||||
|
||||
# End of testinput20
|
16
pcre2/testdata/testinput21
vendored
Normal file
@ -0,0 +1,16 @@
|
||||
# These are tests of \C that do not involve UTF. They are not run when \C is
|
||||
# disabled by compiling with --enable-never-backslash-C.
|
||||
|
||||
/\C+\D \C+\d \C+\S \C+\s \C+\W \C+\w \C+. \C+\R \C+\H \C+\h \C+\V \C+\v \C+\Z \C+\z \C+$/Bx
|
||||
|
||||
/\D+\C \d+\C \S+\C \s+\C \W+\C \w+\C .+\C \R+\C \H+\C \h+\C \V+\C \v+\C a+\C \n+\C \C+\C/Bx
|
||||
|
||||
/ab\Cde/never_backslash_c
|
||||
|
||||
/ab\Cde/info
|
||||
abXde
|
||||
|
||||
/(?<=ab\Cde)X/
|
||||
abZdeX
|
||||
|
||||
# End of testinput21
|
97
pcre2/testdata/testinput22
vendored
Normal file
@ -0,0 +1,97 @@
|
||||
# Tests of \C when Unicode support is available. Note that \C is not supported
|
||||
# for DFA matching in UTF mode, so this test is not run with -dfa. The output
|
||||
# of this test is different in 8-, 16-, and 32-bit modes. Some tests may match
|
||||
# in some widths and not in others.
|
||||
|
||||
/ab\Cde/utf,info
|
||||
abXde
|
||||
|
||||
# This should produce an error diagnostic (\C in UTF lookbehind) in 8-bit and
|
||||
# 16-bit modes, but not in 32-bit mode.
|
||||
|
||||
/(?<=ab\Cde)X/utf
|
||||
ab!deXYZ
|
||||
|
||||
# Autopossessification tests
|
||||
|
||||
/\C+\X \X+\C/Bx
|
||||
|
||||
/\C+\X \X+\C/Bx,utf
|
||||
|
||||
/\C\X*TӅ;
|
||||
{0,6}\v+
|
||||
F
|
||||
/utf
|
||||
\= Expect no match
|
||||
Ӆ\x0a
|
||||
|
||||
/\C(\W?ſ)'?{{/utf
|
||||
\= Expect no match
|
||||
\\C(\\W?ſ)'?{{
|
||||
|
||||
/X(\C{3})/utf
|
||||
X\x{1234}
|
||||
X\x{11234}Y
|
||||
X\x{11234}YZ
|
||||
|
||||
/X(\C{4})/utf
|
||||
X\x{1234}YZ
|
||||
X\x{11234}YZ
|
||||
X\x{11234}YZW
|
||||
|
||||
/X\C*/utf
|
||||
XYZabcdce
|
||||
|
||||
/X\C*?/utf
|
||||
XYZabcde
|
||||
|
||||
/X\C{3,5}/utf
|
||||
Xabcdefg
|
||||
X\x{1234}
|
||||
X\x{1234}YZ
|
||||
X\x{1234}\x{512}
|
||||
X\x{1234}\x{512}YZ
|
||||
X\x{11234}Y
|
||||
X\x{11234}YZ
|
||||
X\x{11234}\x{512}
|
||||
X\x{11234}\x{512}YZ
|
||||
X\x{11234}\x{512}\x{11234}Z
|
||||
|
||||
/X\C{3,5}?/utf
|
||||
Xabcdefg
|
||||
X\x{1234}
|
||||
X\x{1234}YZ
|
||||
X\x{1234}\x{512}
|
||||
X\x{11234}Y
|
||||
X\x{11234}YZ
|
||||
X\x{11234}\x{512}YZ
|
||||
X\x{11234}
|
||||
|
||||
/a\Cb/utf
|
||||
aXb
|
||||
a\nb
|
||||
a\x{100}b
|
||||
|
||||
/a\C\Cb/utf
|
||||
a\x{100}b
|
||||
a\x{12257}b
|
||||
a\x{12257}\x{11234}b
|
||||
|
||||
/ab\Cde/utf
|
||||
abXde
|
||||
|
||||
# This one is here not because it's different to Perl, but because the way
|
||||
# the captured single code unit is displayed. (In Perl it becomes a character,
|
||||
# and you can't tell the difference.)
|
||||
|
||||
/X(\C)(.*)/utf
|
||||
X\x{1234}
|
||||
X\nabc
|
||||
|
||||
# This one is here because Perl gives out a grumbly error message (quite
|
||||
# correctly, but that messes up comparisons).
|
||||
|
||||
/a\Cb/utf
|
||||
\= Expect no match in 8-bit mode
|
||||
a\x{100}b
|
||||
|
7
pcre2/testdata/testinput23
vendored
Normal file
@ -0,0 +1,7 @@
|
||||
# This test is run when PCRE2 has been built with --enable-never-backslash-C,
|
||||
# which disables the use of \C. All we can do is check that it gives the
|
||||
# correct error message.
|
||||
|
||||
/a\Cb/
|
||||
|
||||
# End of testinput23
|
18
pcre2/testdata/testinput3
vendored
@ -8,35 +8,35 @@
|
||||
#forbid_utf
|
||||
|
||||
/^[\w]+/
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
�cole
|
||||
|
||||
/^[\w]+/locale=fr_FR
|
||||
�cole
|
||||
|
||||
/^[\w]+/
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
�cole
|
||||
|
||||
/^[\W]+/
|
||||
�cole
|
||||
|
||||
/^[\W]+/locale=fr_FR
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
�cole
|
||||
|
||||
/[\b]/
|
||||
\b
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
a
|
||||
|
||||
/[\b]/locale=fr_FR
|
||||
\b
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
a
|
||||
|
||||
/^\w+/
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
�cole
|
||||
|
||||
/^\w+/locale=fr_FR
|
||||
@ -46,12 +46,12 @@
|
||||
�cole
|
||||
|
||||
/(.+)\b(.+)/locale=fr_FR
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
�cole
|
||||
|
||||
/�cole/i
|
||||
�cole
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
�cole
|
||||
|
||||
/�cole/i,locale=fr_FR
|
||||
@ -72,7 +72,7 @@
|
||||
|
||||
/^[\xc8-\xc9]/
|
||||
�cole
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
�cole
|
||||
|
||||
/\W+/
|
||||
|
553
pcre2/testdata/testinput4
vendored
File diff suppressed because it is too large
240
pcre2/testdata/testinput5
vendored
@ -3,6 +3,8 @@
|
||||
# results in 8-bit, 16-bit, and 32-bit modes are excluded (see tests 10 and
|
||||
# 12).
|
||||
|
||||
#newline_default lf any anycrlf
|
||||
|
||||
# PCRE2 and Perl disagree about the characteristics of certain Unicode
|
||||
# characters. For example, 061C is considered by Perl to be Arabic, though
|
||||
# is it not listed as such in the Unicode Scripts.txt file, and 2066-2069 are
|
||||
@ -11,11 +13,11 @@
|
||||
# test 4.
|
||||
|
||||
/^[\p{Arabic}]/utf
|
||||
** Failers
|
||||
\= Expect no match
|
||||
\x{061c}
|
||||
|
||||
/^[[:graph:]]+$/utf,ucp
|
||||
** Failers
|
||||
\= Expect no match
|
||||
\x{61c}
|
||||
\x{2066}
|
||||
\x{2067}
|
||||
@ -23,7 +25,7 @@
|
||||
\x{2069}
|
||||
|
||||
/^[[:print:]]+$/utf,ucp
|
||||
** Failers
|
||||
\= Expect no match
|
||||
\x{61c}
|
||||
\x{2066}
|
||||
\x{2067}
|
||||
@ -54,6 +56,7 @@
|
||||
A\x{85}\x{2005}Z
|
||||
|
||||
/^[[:graph:]]+$/utf,ucp
|
||||
\= Expect no match
|
||||
\x{180e}
|
||||
|
||||
/^[[:print:]]+$/utf,ucp
|
||||
@ -63,6 +66,7 @@
|
||||
\x{09}\x{0a}\x{1D}\x{20}\x{85}\x{a0}\x{61c}\x{1680}\x{180e}
|
||||
|
||||
/^[[:^print:]]+$/utf,ucp
|
||||
\= Expect no match
|
||||
\x{180e}
|
||||
|
||||
# End of U+180E tests.
|
||||
@ -109,12 +113,9 @@
|
||||
/.{3,5}?/IB,utf
|
||||
\x{212ab}\x{212ab}\x{212ab}\x{861}
|
||||
|
||||
/(?<=\C)X/utf
|
||||
Should produce an error diagnostic
|
||||
|
||||
/^[ab]/IB,utf
|
||||
bar
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
c
|
||||
\x{ff}
|
||||
\x{100}
|
||||
@ -123,7 +124,7 @@
|
||||
c
|
||||
\x{ff}
|
||||
\x{100}
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
aaa
|
||||
|
||||
/\x{100}*(\d+|"(?1)")/utf
|
||||
@ -133,7 +134,7 @@
|
||||
"\x{100}1234"
|
||||
\x{100}\x{100}12ab
|
||||
\x{100}\x{100}"12"
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
\x{100}\x{100}abcd
|
||||
|
||||
/\x{100}*/IB,utf
|
||||
@ -147,7 +148,7 @@
|
||||
/[Ā-Ą]/utf
|
||||
\x{100}
|
||||
\x{104}
|
||||
*** Failers
|
||||
\= Expect no match
|
||||
\x{105}
|
||||
\x{ff}
|
||||
|
||||
@ -217,7 +218,7 @@
|
||||
a\x{85}b
|
||||
a\x{2028}b
|
||||
a\x{2029}b
|
||||
** Failers
|
||||
\= Expect no match
|
||||
a\n\rb
|
||||
|
||||
/^a\R*b/bsr=unicode,utf
|
||||
@ -240,7 +241,7 @@
|
||||
a\x{85}b
|
||||
a\n\rb
|
||||
a\n\r\x{85}\x0cb
|
||||
** Failers
|
||||
\= Expect no match
|
||||
ab
|
||||
|
||||
/^a\R{1,3}b/bsr=unicode,utf
|
||||
@ -251,34 +252,34 @@
|
||||
a\r\n\r\n\r\nb
|
||||
a\n\r\n\rb
|
||||
a\n\n\r\nb
|
||||
** Failers
|
||||
\= Expect no match
|
||||
a\n\n\n\rb
|
||||
a\r
|
||||
|
||||
/\H\h\V\v/utf
|
||||
X X\x0a
|
||||
X\x09X\x0b
|
||||
** Failers
|
||||
\= Expect no match
|
||||
\x{a0} X\x0a
|
||||
|
||||
/\H*\h+\V?\v{3,4}/utf
|
||||
\x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
|
||||
\x09\x20\x{a0}\x0a\x0b\x0c\x0d\x0a
|
||||
\x09\x20\x{a0}\x0a\x0b\x0c
|
||||
** Failers
|
||||
\= Expect no match
|
||||
\x09\x20\x{a0}\x0a\x0b
|
||||
|
||||
/\H\h\V\v/utf
|
||||
\x{3001}\x{3000}\x{2030}\x{2028}
|
||||
X\x{180e}X\x{85}
|
||||
** Failers
|
||||
\= Expect no match
|
||||
\x{2009} X\x0a
|
||||
|
||||
/\H*\h+\V?\v{3,4}/utf
|
||||
\x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x0c\x0d\x0a
|
||||
\x09\x{205f}\x{a0}\x0a\x{2029}\x0c\x{2028}\x0a
|
||||
\x09\x20\x{202f}\x0a\x0b\x0c
|
||||
** Failers
|
||||
\= Expect no match
|
||||
\x09\x{200a}\x{a0}\x{2028}\x0b
|
||||
|
||||
/[\h]/B,utf
|
||||
@ -300,7 +301,7 @@
|
||||
a\rb
|
||||
a\nb
|
||||
a\r\nb
|
||||
** Failers
|
||||
\= Expect no match
|
||||
a\x{85}b
|
||||
a\x0bb
|
||||
|
||||
@ -315,7 +316,7 @@
|
||||
a\rb
|
||||
a\nb
|
||||
a\r\nb
|
||||
** Failers
|
||||
\= Expect no match
|
||||
a\x{85}b
|
||||
a\x0bb
|
||||
|
||||
@ -325,11 +326,10 @@
|
||||
a\r\nb
|
||||
a\x{85}b
|
||||
a\x0bb
|
||||
** Failers
|
||||
|
||||
/.*a.*=.b.*/utf,newline=any
|
||||
QQQ\x{2029}ABCaXYZ=!bPQR
|
||||
** Failers
|
||||
\= Expect no match
|
||||
a\x{2029}b
|
||||
\x61\xe2\x80\xa9\x62
|
||||
|
||||
@ -338,13 +338,13 @@
|
||||
/a[^]b/utf,alt_bsux,allow_empty_class,match_unset_backref
|
||||
a\x{1234}b
|
||||
a\nb
|
||||
** Failers
|
||||
\= Expect no match
|
||||
ab
|
||||
|
||||
/a[^]+b/utf,alt_bsux,allow_empty_class,match_unset_backref
|
||||
aXb
|
||||
a\nX\nX\x{1234}b
|
||||
** Failers
|
||||
\= Expect no match
|
||||
ab
|
||||
|
||||
/(\x{de})\1/
|
||||
@ -396,6 +396,7 @@
|
||||
X\x{123}\x{123}\x{123}\x{123}\=ps
|
||||
|
||||
/X\x{123}{2,4}b/utf
|
||||
\= Expect no match
|
||||
Xx\=ps
|
||||
X\x{123}x\=ps
|
||||
X\x{123}\x{123}x\=ps
|
||||
@ -403,6 +404,7 @@
|
||||
X\x{123}\x{123}\x{123}\x{123}x\=ps
|
||||
|
||||
/X\x{123}{2,4}?b/utf
|
||||
\= Expect no match
|
||||
Xx\=ps
|
||||
X\x{123}x\=ps
|
||||
X\x{123}\x{123}x\=ps
|
||||
@ -410,6 +412,7 @@
|
||||
X\x{123}\x{123}\x{123}\x{123}x\=ps
|
||||
|
||||
/X\x{123}{2,4}+b/utf
|
||||
\= Expect no match
|
||||
Xx\=ps
|
||||
X\x{123}x\=ps
|
||||
X\x{123}\x{123}x\=ps
|
||||
@ -804,6 +807,7 @@
|
||||
/[^\x{100}]*[^\x{10000}]+[^\x{10ffff}]??[^\x{8000}]{4,}[^\x{7fff}]{2,9}?[^\x{fffff}]{5,6}+/Bi,utf
|
||||
|
||||
/(?<=\x{1234}\x{1234})\bxy/I,utf
|
||||
|
||||
/(?<!^)ETA/utf
|
||||
\= Expect no match
|
||||
ETA
|
||||
@ -834,7 +838,7 @@
|
||||
|
||||
/[\p{Nd}+-]+/IB,utf
|
||||
1234
|
||||
12-34
|
||||
12-34
|
||||
12+\x{661}-34
|
||||
\= Expect no match
|
||||
abcd
|
||||
@ -901,7 +905,7 @@
|
||||
\x{2068}
|
||||
\x{2069}
|
||||
|
||||
/^\p{Cs}/utf
|
||||
/^\p{Cs}/utf
|
||||
\x{dfff}\=no_utf_check
|
||||
\= Expect no match
|
||||
\x{09f}
|
||||
@ -918,7 +922,7 @@
|
||||
\x{230a}
|
||||
|
||||
/^\p{Sc}+/utf
|
||||
$\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
|
||||
$\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
|
||||
\x{9f2}
|
||||
\= Expect no match
|
||||
X
|
||||
@ -928,7 +932,7 @@
|
||||
\ \
|
||||
\x{a0}
|
||||
\x{1680}
|
||||
\x{2000}
|
||||
\x{2000}
|
||||
\x{2001}
|
||||
\= Expect no match
|
||||
\x{2028}
|
||||
@ -937,31 +941,31 @@
|
||||
# These are here because Perl has problems with the negative versions of the
|
||||
# properties and has changed how it behaves for caseless matching.
|
||||
|
||||
/\p{^Lu}/i,utf
|
||||
/\p{^Lu}/i,utf
|
||||
1234
|
||||
\= Expect no match
|
||||
ABC
|
||||
|
||||
/\P{Lu}/i,utf
|
||||
/\P{Lu}/i,utf
|
||||
1234
|
||||
\= Expect no match
|
||||
ABC
|
||||
|
||||
/\p{Ll}/i,utf
|
||||
a
|
||||
a
|
||||
Az
|
||||
\= Expect no match
|
||||
ABC
|
||||
|
||||
/\p{Lu}/i,utf
|
||||
A
|
||||
A
|
||||
a\x{10a0}B
|
||||
\= Expect no match
|
||||
a
|
||||
\x{1d00}
|
||||
|
||||
/\p{Lu}/i,utf
|
||||
A
|
||||
A
|
||||
aZ
|
||||
\= Expect no match
|
||||
abc
|
||||
@ -1018,12 +1022,12 @@
|
||||
ABCD
|
||||
1234
|
||||
\x{6ca}
|
||||
\x{a6c}
|
||||
\x{a6c}
|
||||
\x{10a7}
|
||||
\= Expect no match
|
||||
_ABC
|
||||
|
||||
/^\p{Xan}+/utf
|
||||
/^\p{Xan}+/utf
|
||||
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
|
||||
\= Expect no match
|
||||
_ABC
|
||||
@ -1044,18 +1048,18 @@
|
||||
ABCD1234_
|
||||
1234abcd_
|
||||
\x{6ca}
|
||||
\x{a6c}
|
||||
\x{a6c}
|
||||
\x{10a7}
|
||||
\= Expect no match
|
||||
_ABC
|
||||
|
||||
/^[\p{Xan}]+/utf
|
||||
/^[\p{Xan}]+/utf
|
||||
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
|
||||
\= Expect no match
|
||||
_ABC
|
||||
|
||||
/^>\p{Xsp}/utf
|
||||
>\x{1680}\x{2028}\x{0b}
|
||||
>\x{1680}\x{2028}\x{0b}
|
||||
>\x{a0}
|
||||
\= Expect no match
|
||||
\x{0b}
|
||||
@ -1082,7 +1086,7 @@
|
||||
> \x{09}\x{0a}\x{0c}\x{0d}\x{a0}\x{1680}\x{2028}\x{0b}
|
||||
|
||||
/^>\p{Xps}/utf
|
||||
>\x{1680}\x{2028}\x{0b}
|
||||
>\x{1680}\x{2028}\x{0b}
|
||||
>\x{a0}
|
||||
\= Expect no match
|
||||
\x{0b}
|
||||
@ -1113,7 +1117,7 @@
|
||||
1234
|
||||
\x{6ca}
|
||||
\x{a6c}
|
||||
\x{10a7}
|
||||
\x{10a7}
|
||||
_ABC
|
||||
\= Expect no match
|
||||
[]
|
||||
@ -1138,7 +1142,7 @@
|
||||
1234abcd_
|
||||
\x{6ca}
|
||||
\x{a6c}
|
||||
\x{10a7}
|
||||
\x{10a7}
|
||||
_ABC
|
||||
\= Expect no match
|
||||
[]
|
||||
@ -1232,7 +1236,7 @@
|
||||
|
||||
# Without PCRE_UCP, non-ASCII always fail, even if < 256
|
||||
|
||||
/\b...\B/utf
|
||||
/\b...\B/utf
|
||||
abc_
|
||||
\= Expect no match
|
||||
\x{37e}abc\x{376}
|
||||
@ -1288,9 +1292,11 @@
|
||||
/A+\p{N}A+\dB+\p{N}*B+\d*/B,ucp
|
||||
|
||||
# These behaved oddly in Perl, so they are kept in this test
|
||||
|
||||
/(\x{23a}\x{23a}\x{23a})?\1/i,utf
|
||||
\= Expect no match
|
||||
\x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}
|
||||
|
||||
/(ȺȺȺ)?\1/i,utf
|
||||
\= Expect no match
|
||||
ȺȺȺⱥⱥ
|
||||
@ -1300,9 +1306,11 @@
|
||||
|
||||
/(ȺȺȺ)?\1/i,utf
|
||||
ȺȺȺⱥⱥⱥ
|
||||
|
||||
/(\x{23a}\x{23a}\x{23a})\1/i,utf
|
||||
\= Expect no match
|
||||
\x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}
|
||||
|
||||
/(ȺȺȺ)\1/i,utf
|
||||
\= Expect no match
|
||||
ȺȺȺⱥⱥ
|
||||
@ -1328,19 +1336,19 @@
|
||||
# These scripts weren't yet in Perl when I added Unicode 6.0.0 to PCRE
|
||||
|
||||
/^[\p{Batak}]/utf
|
||||
\x{1bc0}
|
||||
\x{1bc0}
|
||||
\x{1bff}
|
||||
\= Expect no match
|
||||
\x{1bf4}
|
||||
|
||||
/^[\p{Brahmi}]/utf
|
||||
\x{11000}
|
||||
\x{11000}
|
||||
\x{1106f}
|
||||
\= Expect no match
|
||||
\x{1104e}
|
||||
|
||||
/^[\p{Mandaic}]/utf
|
||||
\x{840}
|
||||
\x{840}
|
||||
\x{85e}
|
||||
\= Expect no match
|
||||
\x{85c}
|
||||
@ -1355,11 +1363,9 @@
|
||||
/^\X/utf
|
||||
́réo
|
||||
|
||||
/^a\X41z/alt_bsux,allow_empty_class,match_unset_backref,dupnames
|
||||
/^a\X41z/alt_bsux,allow_empty_class,match_unset_backref,dupnames
|
||||
aX41z
|
||||
\= Expect no match
|
||||
aAz
|
||||
|
||||
aAz
|
||||
|
||||
/\X/
|
||||
@ -1453,7 +1459,7 @@
|
||||
|
||||
/\x{3a3}+./i,utf,aftertext
|
||||
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
|
||||
|
||||
|
||||
/\x{3a3}++./i,utf,aftertext
|
||||
\= Expect no match
|
||||
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
|
||||
@ -1463,19 +1469,24 @@
|
||||
/[^\x{3a3}]*\x{3c2}/Bi,utf
|
||||
|
||||
/[^a]*\x{3c2}/Bi,utf
|
||||
|
||||
/ist/Bi,utf
|
||||
\= Expect no match
|
||||
ikt
|
||||
|
||||
/is+t/i,utf
|
||||
iSs\x{17f}t
|
||||
\= Expect no match
|
||||
ikt
|
||||
|
||||
/is+?t/i,utf
|
||||
\= Expect no match
|
||||
ikt
|
||||
|
||||
/is?t/i,utf
|
||||
\= Expect no match
|
||||
ikt
|
||||
|
||||
/is{2}t/i,utf
|
||||
\= Expect no match
|
||||
iskt
|
||||
@ -1485,52 +1496,52 @@
|
||||
/^\p{Xuc}/utf
|
||||
$abc
|
||||
@abc
|
||||
`abc
|
||||
`abc
|
||||
\x{1234}abc
|
||||
\= Expect no match
|
||||
abc
|
||||
|
||||
/^\p{Xuc}+/utf
|
||||
/^\p{Xuc}+/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
|
||||
/^\p{Xuc}+?/utf
|
||||
/^\p{Xuc}+?/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
|
||||
/^\p{Xuc}+?\*/utf
|
||||
/^\p{Xuc}+?\*/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
|
||||
/^\p{Xuc}++/utf
|
||||
/^\p{Xuc}++/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
|
||||
/^\p{Xuc}{3,5}/utf
|
||||
/^\p{Xuc}{3,5}/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
|
||||
/^\p{Xuc}{3,5}?/utf
|
||||
/^\p{Xuc}{3,5}?/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
|
||||
/^[\p{Xuc}]/utf
|
||||
/^[\p{Xuc}]/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
|
||||
/^[\p{Xuc}]+/utf
|
||||
/^[\p{Xuc}]+/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
|
||||
/^\P{Xuc}/utf
|
||||
/^\P{Xuc}/utf
|
||||
abc
|
||||
\= Expect no match
|
||||
$abc
|
||||
@ -1538,7 +1549,7 @@
|
||||
`abc
|
||||
\x{1234}abc
|
||||
|
||||
/^[\P{Xuc}]/utf
|
||||
/^[\P{Xuc}]/utf
|
||||
abc
|
||||
\= Expect no match
|
||||
$abc
|
||||
@ -1603,13 +1614,13 @@
|
||||
|
||||
/[\p{N}]?+/B,no_auto_possess
|
||||
|
||||
/[\p{L}ab]{2,3}+/B,no_auto_possess
|
||||
/[\p{L}ab]{2,3}+/B,no_auto_possess
|
||||
|
||||
/\D+\X \d+\X \S+\X \s+\X \W+\X \w+\X \R+\X \H+\X \h+\X \V+\X \v+\X a+\X \n+\X .+\X/Bx
|
||||
|
||||
/.+\X/Bsx
|
||||
|
||||
/\X+$/Bmx
|
||||
/\X+$/Bmx
|
||||
|
||||
/\X+\D \X+\d \X+\S \X+\s \X+\W \X+\w \X+. \X+\R \X+\H \X+\h \X+\V \X+\v \X+\X \X+\Z \X+\z \X+$/Bx
|
||||
|
||||
@ -1634,9 +1645,7 @@
|
||||
|
||||
/ábc/utf,replace=XሴZ
|
||||
123ábc123
|
||||
|
||||
/(?<=abc)(|def)/g,utf,replace=<$0>
|
||||
123abcáyzabcdef789abcሴqr
|
||||
|
||||
/(?<=abc)(|def)/g,utf,replace=<$0>
|
||||
123abcáyzabcdef789abcሴqr
|
||||
|
||||
@ -1651,4 +1660,107 @@
|
||||
|
||||
"\xa\xf<(.\pZ*\P{Xwd}+^\xa8\3'3yq.::?(?J:()\xd1+!~:3'(8?:)':(?'d'(?'d'^u]!.+.+\\A\Ah(n+?9){7}+\K;(?'X'u'(?'c'(?'z'(?<y>\xb::\xf0'|\xd3(\xae?'w(z\x8?P>l)\x8?P>a)'\H\R\xd1+!!~:3'(?:h$N{26875}\W+?\\=D{2}\x89(?i:Uy0\N({2\xa(\v\x85*){y*\A(()\p{L}+?\P{^Xan}'+?\xff\+pS\?|).{;y*\A(()\p{L}+?\8}\d?1(|)(/1){7}.+[Lp{Me}].\s\xdcC*?(?(<y>))(?<!^)$C((;*?(R))+(\xbf(R))\x8a\X*?\x8a\xb\xd1^9\3*+(\xc1,\k'R'\xb4)\xcc(z\z(?J)(?'X'\x1b(\xb\xd1^9\?'3*+P{^Xan}+?\xff\+(\xc1.]k+\xb'Pm'\xb4)\xcc4f\xa7'\xd1V(?i:U,{2,2})'(?'X'))?-%--\x95$9*\4'|\xd1(\x9c''%\x94$9)#(?'R')3\x7?('P\xed7'\xa8\xb1^u\xeaw\1\0\0\(|(?1){7}.+[\p{Me}].\s\xdcC*^\x14?(?(<y>))(?<!^)$C((;*?(R*?))+(?(R)\x8a\X*?\x8a\xb\xd1^9\3*+|(\xc1,\k'R'\xb4)\xcc! z)\z(?JJ)(?'X';(\xb\xd1^9\?'3*+(\xc1.]k+\xb'Pm'\xb4))':(?'d')(?'RD'(d')|)|$)'|(?<x>\g{d});\g{x}\x11\g{d}\x81\|$((?'X'\'X'(?'W''\x92()'9'\x83*))\xba*\!?^ <){)':;\xcc4'\xd1'(?'X'28))?-%--\x95$9*\4'|\xd1((''e\x94*$9:)*#(?'R')3)\x7?('P\xed')\\x16:;()\x1e\x10*:(?<y>)\xd1+0!~:(?)'d'E:yD!\s(?'R'\x1e;\x10:U))|'\x9g!\xb0*){)\\x16:;()\x1e\x10\x87*:(?<y>)\xd1+!~:(?)'}'\d'E:yD!\s(?'R'\x1e;\x10:U))|'))|)g!\xb0*R+9{29+)#(?'P'})*?pS\{3,}\x85,{0,}l{*UTF)(\xe{7}){3722,{9,}d{2,?|))|{)\(A?&d}}{\xa,}2}){3,}7,l{)22}(,}l:7{2,4}}29\x19+)#?'P'})*v?))\x5"
|
||||
|
||||
/$(&.+[\p{Me}].\s\xdcC*?(?(<y>))(?<!^)$C((;*?(R))+(?(R)){0,6}?|){12\x8a\X*?\x8a\x0b\xd1^9\3*+(\xc1,\k'P'\xb4)\xcc(z\z(?JJ)(?'X'8};(\x0b\xd1^9\?'3*+(\xc1.]k+\x0b'Pm'\xb4\xcc4'\xd1'(?'X'))?-%--\x95$9*\4'|\xd1(''%\x95*$9)#(?'R')3\x07?('P\xed')\\x16:;()\x1e\x10*:(?<y>)\xd1+!~:(?)''(d'E:yD!\s(?'R'\x1e;\x10:U))|')g!\xb0*){29+))#(?'P'})*?/
|
||||
|
||||
"(*UTF)(*UCP)(.UTF).+X(\V+;\^(\D|)!999}(?(?C{7(?C')\H*\S*/^\x5\xa\\xd3\x85n?(;\D*(?m).[^mH+((*UCP)(*U:F)})(?!^)(?'"
|
||||
|
||||
/[\pS#moq]/
|
||||
=
|
||||
|
||||
/(*:a\x{12345}b\t(d\)c)xxx/utf,alt_verbnames,mark
|
||||
cxxxz
|
||||
|
||||
/abcd/utf,replace=x\x{824}y\o{3333}z(\Q12\$34$$\x34\E5$$),substitute_extended
|
||||
abcd
|
||||
|
||||
/a(\x{e0}\x{101})(\x{c0}\x{102})/utf,replace=a\u$1\U$1\E$1\l$2\L$2\Eab\U\x{e0}\x{101}\L\x{d0}\x{160}\EDone,substitute_extended
|
||||
a\x{e0}\x{101}\x{c0}\x{102}
|
||||
|
||||
/((?<digit>\d)|(?<letter>\p{L}))/g,substitute_extended,replace=<${digit:+digit; :not digit; }${letter:+letter:not a letter}>
|
||||
ab12cde
|
||||
|
||||
/(*UCP)(*UTF)[[:>:]]X/B
|
||||
|
||||
/abc/utf,replace=xyz
|
||||
abc\=zero_terminate
|
||||
|
||||
/a[[:punct:]b]/ucp,bincode
|
||||
|
||||
/a[[:punct:]b]/utf,ucp,bincode
|
||||
|
||||
/a[b[:punct:]]/utf,ucp,bincode
|
||||
|
||||
/[[:^ascii:]]/utf,ucp,bincode
|
||||
|
||||
/[[:^ascii:]\w]/utf,ucp,bincode
|
||||
|
||||
/[\w[:^ascii:]]/utf,ucp,bincode
|
||||
|
||||
/[^[:ascii:]\W]/utf,ucp,bincode
|
||||
\x{de}
|
||||
\x{200}
|
||||
\= Expect no match
|
||||
\x{300}
|
||||
\x{37e}
|
||||
|
||||
/[[:^ascii:]a]/utf,ucp,bincode
|
||||
|
||||
/L(?#(|++<!(2)?/B,utf,no_auto_possess,auto_callout
|
||||
|
||||
/L(?#(|++<!(2)?/B,utf,ucp,auto_callout
|
||||
|
||||
/(*UTF)C\x09((?<!'(?x)!*H? #\xcc\x9a[^$]/
|
||||
|
||||
/[\D]/utf
|
||||
\x{1d7cf}
|
||||
|
||||
/[\D\P{Nd}]/utf
|
||||
\x{1d7cf}
|
||||
|
||||
/[^\D]/utf
|
||||
a9b
|
||||
\= Expect no match
|
||||
\x{1d7cf}
|
||||
|
||||
/[^\D\P{Nd}]/utf
|
||||
a9b
|
||||
\x{1d7cf}
|
||||
\= Expect no match
|
||||
\x{10000}
|
||||
|
||||
# Hex uses pattern length, not zero-terminated. This tests for overrunning
|
||||
# the given length of a pattern.
|
||||
|
||||
/'(*UTF)'/hex
|
||||
|
||||
/'#('/hex,extended,utf
|
||||
|
||||
/a(?<=A\XB)/utf
|
||||
|
||||
/ab(?<=A\RB)/utf
|
||||
|
||||
/../utf,auto_callout
|
||||
\n\x{123}\x{123}\x{123}\x{123}
|
||||
|
||||
# This tests processing wide characters in extended mode.
|
||||
|
||||
/XȀ/x,utf
|
||||
|
||||
# These three test a bug fix that was not clearing up after a locale setting
|
||||
# when the test or a subsequent one matched a wide character.
|
||||
|
||||
//locale=C
|
||||
|
||||
/[\P{Yi}]/utf
|
||||
\x{2f000}
|
||||
|
||||
/[\P{Yi}]/utf,locale=C
|
||||
\x{2f000}
|
||||
|
||||
/^(?<!(?=))/B,utf
|
||||
|
||||
# Horizontal and vertical space lists ignore caseless
|
||||
|
||||
/[\HH]/Bi,utf
|
||||
|
||||
/[^\HH]/Bi,utf
|
||||
|
628
pcre2/testdata/testinput6
vendored
File diff suppressed because it is too large
413
pcre2/testdata/testinput7
vendored
File diff suppressed because it is too large
42
pcre2/testdata/testinput8
vendored
@ -1,8 +1,11 @@
|
||||
# These are a few representative patterns whose lengths and offsets are to be
|
||||
# shown when the link size is 2. This is just a doublecheck test to ensure the
|
||||
# sizes don't go horribly wrong when something is changed. The pattern contents
|
||||
# are all themselves checked in other tests. Unicode, including property
|
||||
# support, is required for these tests.
|
||||
# There are two sorts of patterns in this test. A number of them are
|
||||
# representative patterns whose lengths and offsets are checked. This is just a
|
||||
# doublecheck test to ensure the sizes don't go horribly wrong when something
|
||||
# is changed. The operation of these patterns is checked in other tests.
|
||||
#
|
||||
# This file also contains tests whose output varies with code unit size and/or
|
||||
# link size. Unicode support is required for these tests. There are separate
|
||||
# output files for each code unit size and link size.
|
||||
|
||||
#pattern fullbincode,memory
|
||||
|
||||
@ -67,7 +70,7 @@
|
||||
/\xff/utf
|
||||
|
||||
/\x{0041}\x{2262}\x{0391}\x{002e}/I,utf
|
||||
|
||||
|
||||
/\x{D55c}\x{ad6d}\x{C5B4}/I,utf
|
||||
|
||||
/\x{65e5}\x{672c}\x{8a9e}/I,utf
|
||||
@ -150,10 +153,33 @@
|
||||
|
||||
# Check the absolute limit on nesting (?| etc. This varies with code unit
|
||||
# width because the workspace is a different number of bytes. It will fail
|
||||
# in 8-bit and 16-bit but not in 32-bit.
|
||||
|
||||
# with link size 2 in 8-bit and 16-bit but not in 32-bit.
|
||||
|
||||
/(?|(?|(?J:(?|(?x:(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|
|
||||
)))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
|
||||
/parens_nest_limit=1000,-fullbincode
|
||||
|
||||
# Use "expand" to create some very long patterns with nested parentheses, in
|
||||
# order to test workspace overflow. Again, this varies with code unit width,
|
||||
# and even when it fails in two modes, the error offset differs. It also varies
|
||||
# with link size - hence multiple tests with different values.
|
||||
|
||||
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
|
||||
|
||||
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
|
||||
|
||||
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
|
||||
|
||||
/(?(1)(?1)){8,}+()/debug
|
||||
abcd
|
||||
|
||||
/(?(1)|a(?1)b){2,}+()/debug
|
||||
abcde
|
||||
|
||||
/((?1)(?2)(?3)(?4)(?5)(?6)(?7)(?8)(?9)(?9)(?8)(?7)(?6)(?5)(?4)(?3)(?2)(?1)(?0)){2,}()()()()()()()()()/debug
|
||||
|
||||

|
||||
|
||||
fullbincode
|
||||
|
||||
# End of testinput8
|
||||
|
17
pcre2/testdata/testinput9
vendored
@ -2,11 +2,10 @@
|
||||
# UTF-8 or Unicode property support. */
|
||||
|
||||
#forbid_utf
|
||||
#newline_default lf any anycrlf
|
||||
|
||||
/a\Cb/
|
||||
aXb
|
||||
a\nb
|
||||
** Failers (too big char)
|
||||
/ab/
|
||||
\= Expect error message (too big char) and no match
|
||||
A\x{123}B
|
||||
A\o{443}B
|
||||
|
||||
@ -240,9 +239,15 @@
|
||||
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark
|
||||
XX
|
||||
|
||||
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark,alt_verbnames
|
||||
XX
|
||||
|
||||
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark
|
||||
XX
|
||||
|
||||
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark,alt_verbnames
|
||||
XX
|
||||
|
||||
/\u0100/alt_bsux,allow_empty_class,match_unset_backref,dupnames
|
||||
|
||||
/[\u0100-\u0200]/alt_bsux,allow_empty_class,match_unset_backref,dupnames
|
||||
@ -251,4 +256,8 @@
|
||||
|
||||
/[^\s]*\s* [^\W]+\W+ [^\d]*?\d0 [^\d\w]{4,6}?\w*A/B
|
||||
|
||||
/(*MARK:a\x{100}b)z/alt_verbnames
|
||||
|
||||
/(*:*++++++++++++''''''''''''''''''''+''+++'+++x+++++++++++++++++++++++++++++++++++(++++++++++++++++++++:++++++%++:''''''''''''''''''''''''+++++++++++++++++++++++++++++++++++++++++++++++++++++-++++++++k+++++++''''+++'+++++++++++++++++++++++''''++++++++++++':ƿ)/
|
||||
|
||||
# End of testinput9
|
||||
|
1206
pcre2/testdata/testoutput1
vendored
File diff suppressed because it is too large
265
pcre2/testdata/testoutput10
vendored
@ -1,70 +1,10 @@
|
||||
# This set of tests is for UTF-8 support and Unicode property support, with
|
||||
# relevance only for the 8-bit library.
|
||||
|
||||
/X(\C{3})/utf
|
||||
X\x{1234}
|
||||
0: X\x{1234}
|
||||
1: \x{1234}
|
||||
|
||||
/X(\C{4})/utf
|
||||
X\x{1234}YZ
|
||||
0: X\x{1234}Y
|
||||
1: \x{1234}Y
|
||||
|
||||
/X\C*/utf
|
||||
XYZabcdce
|
||||
0: XYZabcdce
|
||||
|
||||
/X\C*?/utf
|
||||
XYZabcde
|
||||
0: X
|
||||
|
||||
/X\C{3,5}/utf
|
||||
Xabcdefg
|
||||
0: Xabcde
|
||||
X\x{1234}
|
||||
0: X\x{1234}
|
||||
X\x{1234}YZ
|
||||
0: X\x{1234}YZ
|
||||
X\x{1234}\x{512}
|
||||
0: X\x{1234}\x{512}
|
||||
X\x{1234}\x{512}YZ
|
||||
0: X\x{1234}\x{512}
|
||||
|
||||
/X\C{3,5}?/utf
|
||||
Xabcdefg
|
||||
0: Xabc
|
||||
X\x{1234}
|
||||
0: X\x{1234}
|
||||
X\x{1234}YZ
|
||||
0: X\x{1234}
|
||||
X\x{1234}\x{512}
|
||||
0: X\x{1234}
|
||||
|
||||
/a\Cb/utf
|
||||
aXb
|
||||
0: aXb
|
||||
a\nb
|
||||
0: a\x{0a}b
|
||||
|
||||
/a\C\Cb/utf
|
||||
a\x{100}b
|
||||
0: a\x{100}b
|
||||
|
||||
/ab\Cde/utf
|
||||
abXde
|
||||
0: abXde
|
||||
|
||||
/a\C\Cb/utf
|
||||
a\x{100}b
|
||||
0: a\x{100}b
|
||||
** Failers
|
||||
No match
|
||||
a\x{12257}b
|
||||
No match
|
||||
# The next 4 patterns have UTF-8 errors
|
||||
|
||||
/[�]/utf
|
||||
Failed: error -8 at offset 0: UTF-8 error: byte 2 top bits not 0x80
|
||||
Failed: error -8 at offset 1: UTF-8 error: byte 2 top bits not 0x80
|
||||
|
||||
/�/utf
|
||||
Failed: error -3 at offset 0: UTF-8 error: 1 byte missing at end
|
||||
@ -72,7 +12,13 @@ Failed: error -3 at offset 0: UTF-8 error: 1 byte missing at end
|
||||
/���xxx/utf
|
||||
Failed: error -8 at offset 0: UTF-8 error: byte 2 top bits not 0x80
|
||||
|
||||
/��������/utf
|
||||
Failed: error -22 at offset 2: UTF-8 error: isolated byte with 0x80 bit set
|
||||
|
||||
# Now test subjects
|
||||
|
||||
/badutf/utf
|
||||
\= Expect UTF-8 errors
|
||||
X\xdf
|
||||
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 1
|
||||
XX\xef
|
||||
@ -146,13 +92,14 @@ Failed: error -20: UTF-8 error: overlong 5-byte sequence at offset 0
|
||||
\xfc\x80\x80\x80\x80\x8f
|
||||
Failed: error -21: UTF-8 error: overlong 6-byte sequence at offset 0
|
||||
\x80
|
||||
Failed: error -22: UTF-8 error: isolated 0x80 byte at offset 0
|
||||
Failed: error -22: UTF-8 error: isolated byte with 0x80 bit set at offset 0
|
||||
\xfe
|
||||
Failed: error -23: UTF-8 error: illegal byte (0xfe or 0xff) at offset 0
|
||||
\xff
|
||||
Failed: error -23: UTF-8 error: illegal byte (0xfe or 0xff) at offset 0
|
||||
|
||||
/badutf/utf
|
||||
\= Expect UTF-8 errors
|
||||
XX\xfb\x80\x80\x80\x80
|
||||
Failed: error -13: UTF-8 error: 5-byte character is not allowed (RFC 3629) at offset 2
|
||||
XX\xfd\x80\x80\x80\x80\x80
|
||||
@ -161,6 +108,7 @@ Failed: error -14: UTF-8 error: 6-byte character is not allowed (RFC 3629) at of
|
||||
Failed: error -15: UTF-8 error: code points greater than 0x10ffff are not defined at offset 2
|
||||
|
||||
/shortutf/utf
|
||||
\= Expect UTF-8 errors
|
||||
XX\xdf\=ph
|
||||
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 2
|
||||
XX\xef\=ph
|
||||
@ -193,6 +141,7 @@ Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 0
|
||||
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 0
|
||||
|
||||
/anything/utf
|
||||
\= Expect UTF-8 errors
|
||||
X\xc0\x80
|
||||
Failed: error -17: UTF-8 error: overlong 2-byte sequence at offset 1
|
||||
XX\xc1\x8f
|
||||
@ -209,6 +158,15 @@ Failed: error -21: UTF-8 error: overlong 6-byte sequence at offset 0
|
||||
Failed: error -23: UTF-8 error: illegal byte (0xfe or 0xff) at offset 0
|
||||
\xff\x80\x80\x80\x80\x80
|
||||
Failed: error -23: UTF-8 error: illegal byte (0xfe or 0xff) at offset 0
|
||||
\xf8\x88\x80\x80\x80
|
||||
Failed: error -13: UTF-8 error: 5-byte character is not allowed (RFC 3629) at offset 0
|
||||
\xf9\x87\x80\x80\x80
|
||||
Failed: error -13: UTF-8 error: 5-byte character is not allowed (RFC 3629) at offset 0
|
||||
\xfc\x84\x80\x80\x80\x80
|
||||
Failed: error -14: UTF-8 error: 6-byte character is not allowed (RFC 3629) at offset 0
|
||||
\xfd\x83\x80\x80\x80\x80
|
||||
Failed: error -14: UTF-8 error: 6-byte character is not allowed (RFC 3629) at offset 0
|
||||
\= Expect no match
|
||||
\xc3\x8f
|
||||
No match
|
||||
\xe0\xaf\x80
|
||||
@ -219,14 +177,6 @@ No match
|
||||
No match
|
||||
\xf1\x8f\x80\x80
|
||||
No match
|
||||
\xf8\x88\x80\x80\x80
|
||||
Failed: error -13: UTF-8 error: 5-byte character is not allowed (RFC 3629) at offset 0
|
||||
\xf9\x87\x80\x80\x80
|
||||
Failed: error -13: UTF-8 error: 5-byte character is not allowed (RFC 3629) at offset 0
|
||||
\xfc\x84\x80\x80\x80\x80
|
||||
Failed: error -14: UTF-8 error: 6-byte character is not allowed (RFC 3629) at offset 0
|
||||
\xfd\x83\x80\x80\x80\x80
|
||||
Failed: error -14: UTF-8 error: 6-byte character is not allowed (RFC 3629) at offset 0
|
||||
\xf8\x88\x80\x80\x80\=no_utf_check
|
||||
No match
|
||||
\xf9\x87\x80\x80\x80\=no_utf_check
|
||||
@ -235,7 +185,62 @@ No match
|
||||
No match
|
||||
\xfd\x83\x80\x80\x80\x80\=no_utf_check
|
||||
No match
|
||||
|
||||
# Similar tests with offsets
|
||||
|
||||
/badutf/utf
|
||||
\= Expect UTF-8 errors
|
||||
X\xdfabcd
|
||||
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
|
||||
X\xdfabcd\=offset=1
|
||||
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
|
||||
\= Expect no match
|
||||
X\xdfabcd\=offset=2
|
||||
No match
|
||||
|
||||
/(?<=x)badutf/utf
|
||||
\= Expect UTF-8 errors
|
||||
X\xdfabcd
|
||||
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
|
||||
X\xdfabcd\=offset=1
|
||||
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
|
||||
X\xdfabcd\=offset=2
|
||||
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
|
||||
X\xdfabcd\xdf\=offset=3
|
||||
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 6
|
||||
\= Expect no match
|
||||
X\xdfabcd\=offset=3
|
||||
No match
|
||||
|
||||
/(?<=xx)badutf/utf
|
||||
\= Expect UTF-8 errors
|
||||
X\xdfabcd
|
||||
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
|
||||
X\xdfabcd\=offset=1
|
||||
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
|
||||
X\xdfabcd\=offset=2
|
||||
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
|
||||
X\xdfabcd\=offset=3
|
||||
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
|
||||
|
||||
/(?<=xxxx)badutf/utf
|
||||
\= Expect UTF-8 errors
|
||||
X\xdfabcd
|
||||
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
|
||||
X\xdfabcd\=offset=1
|
||||
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
|
||||
X\xdfabcd\=offset=2
|
||||
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
|
||||
X\xdfabcd\=offset=3
|
||||
Failed: error -8: UTF-8 error: byte 2 top bits not 0x80 at offset 1
|
||||
X\xdfabc\xdf\=offset=6
|
||||
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 5
|
||||
X\xdfabc\xdf\=offset=7
|
||||
Failed: error -33: bad offset value
|
||||
\= Expect no match
|
||||
X\xdfabcd\=offset=6
|
||||
No match
|
||||
|
||||
/\x{100}/IB,utf
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
@ -448,29 +453,6 @@ First code unit = \xf0
|
||||
Last code unit = \xab
|
||||
Subject length lower bound = 1
|
||||
|
||||
# This one is here not because it's different to Perl, but because the way
|
||||
# the captured single-byte is displayed. (In Perl it becomes a character, and you
|
||||
# can't tell the difference.)
|
||||
|
||||
/X(\C)(.*)/utf
|
||||
X\x{1234}
|
||||
0: X\x{1234}
|
||||
1: \x{e1}
|
||||
2: \x{88}\x{b4}
|
||||
X\nabc
|
||||
0: X\x{0a}abc
|
||||
1: \x{0a}
|
||||
2: abc
|
||||
|
||||
# This one is here because Perl gives out a grumbly error message (quite
|
||||
# correctly, but that messes up comparisons).
|
||||
|
||||
/a\Cb/utf
|
||||
*** Failers
|
||||
No match
|
||||
a\x{100}b
|
||||
No match
|
||||
|
||||
/[^ab\xC0-\xF0]/IB,utf
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
@ -499,8 +481,7 @@ Subject length lower bound = 1
|
||||
0: \x{100}
|
||||
\x{1000}
|
||||
0: \x{1000}
|
||||
*** Failers
|
||||
0: *
|
||||
\= Expect no match
|
||||
\x{c0}
|
||||
No match
|
||||
\x{f0}
|
||||
@ -659,8 +640,6 @@ Subject length lower bound = 1
|
||||
0: \x{100}
|
||||
\x{100}Z
|
||||
0: \x{100}
|
||||
*** Failers
|
||||
No match
|
||||
|
||||
/[\xff]/IB,utf
|
||||
------------------------------------------------------------------
|
||||
@ -750,33 +729,35 @@ Failed: error 106 at offset 15: missing terminating ] for character class
|
||||
# This tests the stricter UTF-8 check according to RFC 3629.
|
||||
|
||||
/X/utf
|
||||
\= Expect UTF-8 errors
|
||||
\x{d800}
|
||||
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 0
|
||||
\x{d800}\=no_utf_check
|
||||
No match
|
||||
\x{da00}
|
||||
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 0
|
||||
\x{da00}\=no_utf_check
|
||||
No match
|
||||
\x{dfff}
|
||||
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 0
|
||||
\x{dfff}\=no_utf_check
|
||||
No match
|
||||
\x{110000}
|
||||
Failed: error -15: UTF-8 error: code points greater than 0x10ffff are not defined at offset 0
|
||||
\x{110000}\=no_utf_check
|
||||
No match
|
||||
\x{2000000}
|
||||
Failed: error -13: UTF-8 error: 5-byte character is not allowed (RFC 3629) at offset 0
|
||||
\x{2000000}\=no_utf_check
|
||||
No match
|
||||
\x{7fffffff}
|
||||
Failed: error -14: UTF-8 error: 6-byte character is not allowed (RFC 3629) at offset 0
|
||||
\= Expect no match
|
||||
\x{d800}\=no_utf_check
|
||||
No match
|
||||
\x{da00}\=no_utf_check
|
||||
No match
|
||||
\x{dfff}\=no_utf_check
|
||||
No match
|
||||
\x{110000}\=no_utf_check
|
||||
No match
|
||||
\x{2000000}\=no_utf_check
|
||||
No match
|
||||
\x{7fffffff}\=no_utf_check
|
||||
No match
|
||||
|
||||
/(*UTF8)\x{1234}/
|
||||
abcd\x{1234}pqr
|
||||
abcd\x{1234}pqr
|
||||
0: \x{1234}
|
||||
|
||||
/(*CRLF)(*UTF)(*BSR_UNICODE)a\Rb/I
|
||||
@ -887,16 +868,19 @@ Subject length lower bound = 3
|
||||
/a+/utf
|
||||
a\x{123}aa\=offset=1
|
||||
0: aa
|
||||
a\x{123}aa\=offset=2
|
||||
Error -36 (bad UTF-8 offset)
|
||||
a\x{123}aa\=offset=3
|
||||
0: aa
|
||||
a\x{123}aa\=offset=4
|
||||
0: a
|
||||
a\x{123}aa\=offset=5
|
||||
No match
|
||||
\= Expect bad offset value
|
||||
a\x{123}aa\=offset=6
|
||||
Failed: error -33: bad offset value
|
||||
\= Expect bad UTF-8 offset
|
||||
a\x{123}aa\=offset=2
|
||||
Error -36 (bad UTF-8 offset)
|
||||
\= Expect no match
|
||||
a\x{123}aa\=offset=5
|
||||
No match
|
||||
|
||||
/\x{1234}+/Ii,utf
|
||||
Capturing subpattern count = 0
|
||||
@ -1281,8 +1265,6 @@ Subject length lower bound = 1
|
||||
0: \x{100}
|
||||
\x{100}Z
|
||||
0: \x{100}
|
||||
*** Failers
|
||||
No match
|
||||
|
||||
/[z-\x{100}]/IB,utf
|
||||
------------------------------------------------------------------
|
||||
@ -1467,8 +1449,7 @@ Subject length lower bound = 1
|
||||
0: \x{105}
|
||||
\x{109}
|
||||
0: \x{109}
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
\x{100}
|
||||
No match
|
||||
\x{10a}
|
||||
@ -1507,8 +1488,7 @@ Subject length lower bound = 1
|
||||
0: \x{100}
|
||||
\x{101}
|
||||
0: \x{101}
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
\x{102}
|
||||
No match
|
||||
Y
|
||||
@ -1547,7 +1527,52 @@ Last code unit = 'B' (caseless)
|
||||
Subject length lower bound = 2
|
||||
|
||||
/abc/utf,replace=�
|
||||
abc
|
||||
abc
|
||||
Failed: error -3: UTF-8 error: 1 byte missing at end
|
||||
|
||||
/(?<=(a)(?-1))x/I,utf
|
||||
Capturing subpattern count = 1
|
||||
Max lookbehind = 2
|
||||
Options: utf
|
||||
First code unit = 'x'
|
||||
Subject length lower bound = 1
|
||||
a\x80zx\=offset=3
|
||||
Failed: error -22: UTF-8 error: isolated byte with 0x80 bit set at offset 1
|
||||
|
||||
/[\W\p{Any}]/B
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[\x00-/:-@[-^`{-\xff\p{Any}]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
abc
|
||||
0: a
|
||||
123
|
||||
0: 1
|
||||
|
||||
/[\W\pL]/B
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[\x00-/:-@[-^`{-\xff\p{L}]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
abc
|
||||
0: a
|
||||
\= Expect no match
|
||||
123
|
||||
No match
|
||||
|
||||
/(*:*++++++++++++''''''''''''''''''''+''+++'+++x+++++++++++++++++++++++++++++++++++(++++++++++++++++++++:++++++%++:''''''''''''''''''''''''+++++++++++++++++++++++++++++++++++++++++++++++++++++-++++++++k+++++++''''+++'+++++++++++++++++++++++''''++++++++++++':ƿ)/utf
|
||||
Failed: error 176 at offset 259: name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN)
|
||||
|
||||
/[\s[:^ascii:]]/B,ucp
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[\x80-\xff\p{Xsp}]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
# End of testinput10
|
||||
|
29
pcre2/testdata/testoutput11-16
vendored
@ -4,13 +4,8 @@
|
||||
# different, so they have separate output files.
|
||||
|
||||
#forbid_utf
|
||||
#newline_default LF ANY ANYCRLF
|
||||
|
||||
/a\Cb/
|
||||
aXb
|
||||
0: aXb
|
||||
a\nb
|
||||
0: a\x0ab
|
||||
|
||||
/[^\x{c4}]/IB
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
@ -581,7 +576,7 @@ Failed: error 134 at offset 11: character code point value in \x{} or \o{} is to
|
||||
|
||||
# Non-UTF characters
|
||||
|
||||
/\C{2,3}/
|
||||
/.{2,3}/
|
||||
\x{400000}\x{400001}\x{400002}\x{400003}
|
||||
** Character \x{400000} is greater than 0xffff and UTF-16 mode is not enabled.
|
||||
** Truncation will probably give the wrong result.
|
||||
@ -646,4 +641,24 @@ Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e
|
||||
\xfc \xfd \xfe \xff
|
||||
Subject length lower bound = 1
|
||||
|
||||
/(*THEN:\[A]{65501})/expand
|
||||
|
||||
# We can use pcre2test's utf8_input modifier to create wide pattern characters,
|
||||
# even though this test is run when UTF is not supported.
|
||||
|
||||
/ab������z/utf8_input
|
||||
** Failed: character value greater than 0xffff cannot be converted to 16-bit in non-UTF mode
|
||||
ab������z
|
||||
ab\x{7fffffff}z
|
||||
|
||||
/ab�������z/utf8_input
|
||||
** Failed: invalid UTF-8 string cannot be converted to 16-bit string
|
||||
ab�������z
|
||||
ab\x{ffffffff}z
|
||||
|
||||
/ab�Az/utf8_input
|
||||
** Failed: invalid UTF-8 string cannot be converted to 16-bit string
|
||||
ab�Az
|
||||
ab\x{80000041}z
|
||||
|
||||
# End of testinput11
|
||||
|
32
pcre2/testdata/testoutput11-32
vendored
@ -4,13 +4,8 @@
|
||||
# different, so they have separate output files.
|
||||
|
||||
#forbid_utf
|
||||
#newline_default LF ANY ANYCRLF
|
||||
|
||||
/a\Cb/
|
||||
aXb
|
||||
0: aXb
|
||||
a\nb
|
||||
0: a\x0ab
|
||||
|
||||
/[^\x{c4}]/IB
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
@ -582,7 +577,7 @@ Subject length lower bound = 2
|
||||
|
||||
# Non-UTF characters
|
||||
|
||||
/\C{2,3}/
|
||||
/.{2,3}/
|
||||
\x{400000}\x{400001}\x{400002}\x{400003}
|
||||
0: \x{400000}\x{400001}\x{400002}
|
||||
|
||||
@ -649,4 +644,27 @@ Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e
|
||||
\xfc \xfd \xfe \xff
|
||||
Subject length lower bound = 1
|
||||
|
||||
/(*THEN:\[A]{65501})/expand
|
||||
|
||||
# We can use pcre2test's utf8_input modifier to create wide pattern characters,
|
||||
# even though this test is run when UTF is not supported.
|
||||
|
||||
/ab������z/utf8_input
|
||||
ab������z
|
||||
0: ab\x{7fffffff}z
|
||||
ab\x{7fffffff}z
|
||||
0: ab\x{7fffffff}z
|
||||
|
||||
/ab�������z/utf8_input
|
||||
ab�������z
|
||||
0: ab\x{ffffffff}z
|
||||
ab\x{ffffffff}z
|
||||
0: ab\x{ffffffff}z
|
||||
|
||||
/ab�Az/utf8_input
|
||||
ab�Az
|
||||
0: ab\x{80000041}z
|
||||
ab\x{80000041}z
|
||||
0: ab\x{80000041}z
|
||||
|
||||
# End of testinput11
|
||||
|
201
pcre2/testdata/testoutput12-16
vendored
@ -9,78 +9,6 @@
|
||||
�]
|
||||
** Failed: invalid UTF-8 string cannot be used as input in UTF mode
|
||||
|
||||
/X(\C{3})/utf
|
||||
X\x{11234}Y
|
||||
0: X\x{11234}Y
|
||||
1: \x{11234}Y
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}Y
|
||||
1: \x{11234}Y
|
||||
|
||||
/X(\C{4})/utf
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}YZ
|
||||
1: \x{11234}YZ
|
||||
X\x{11234}YZW
|
||||
0: X\x{11234}YZ
|
||||
1: \x{11234}YZ
|
||||
|
||||
/X\C*/utf
|
||||
XYZabcdce
|
||||
0: XYZabcdce
|
||||
|
||||
/X\C*?/utf
|
||||
XYZabcde
|
||||
0: X
|
||||
|
||||
/X\C{3,5}/utf
|
||||
Xabcdefg
|
||||
0: Xabcde
|
||||
X\x{11234}Y
|
||||
0: X\x{11234}Y
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}YZ
|
||||
X\x{11234}\x{512}
|
||||
0: X\x{11234}\x{512}
|
||||
X\x{11234}\x{512}YZ
|
||||
0: X\x{11234}\x{512}YZ
|
||||
X\x{11234}\x{512}\x{11234}Z
|
||||
0: X\x{11234}\x{512}\x{11234}
|
||||
|
||||
/X\C{3,5}?/utf
|
||||
Xabcdefg
|
||||
0: Xabc
|
||||
X\x{11234}Y
|
||||
0: X\x{11234}Y
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}Y
|
||||
X\x{11234}\x{512}YZ
|
||||
0: X\x{11234}\x{512}
|
||||
*** Failers
|
||||
No match
|
||||
X\x{11234}
|
||||
No match
|
||||
|
||||
/a\Cb/utf
|
||||
aXb
|
||||
0: aXb
|
||||
a\nb
|
||||
0: a\x{0a}b
|
||||
|
||||
/a\C\Cb/utf
|
||||
a\x{12257}b
|
||||
0: a\x{12257}b
|
||||
a\x{12257}\x{11234}b
|
||||
No match
|
||||
** Failers
|
||||
No match
|
||||
a\x{100}b
|
||||
No match
|
||||
|
||||
/ab\Cde/utf
|
||||
abXde
|
||||
0: abXde
|
||||
|
||||
# Check maximum character size
|
||||
|
||||
/\x{ffff}/IB,utf
|
||||
@ -310,29 +238,6 @@ First code unit = \x{d844}
|
||||
Last code unit = \x{deab}
|
||||
Subject length lower bound = 1
|
||||
|
||||
# This one is here not because it's different to Perl, but because the way
|
||||
# the captured single-byte is displayed. (In Perl it becomes a character, and you
|
||||
# can't tell the difference.)
|
||||
|
||||
/X(\C)(.*)/utf
|
||||
X\x{1234}
|
||||
0: X\x{1234}
|
||||
1: \x{1234}
|
||||
2:
|
||||
X\nabc
|
||||
0: X\x{0a}abc
|
||||
1: \x{0a}
|
||||
2: abc
|
||||
|
||||
# This one is here because Perl gives out a grumbly error message (quite
|
||||
# correctly, but that messes up comparisons).
|
||||
|
||||
/a\Cb/utf
|
||||
*** Failers
|
||||
No match
|
||||
a\x{100}b
|
||||
0: a\x{100}b
|
||||
|
||||
/[^ab\xC0-\xF0]/IB,utf
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
@ -362,8 +267,7 @@ Subject length lower bound = 1
|
||||
0: \x{100}
|
||||
\x{1000}
|
||||
0: \x{1000}
|
||||
*** Failers
|
||||
0: *
|
||||
\= Expect no match
|
||||
\x{c0}
|
||||
No match
|
||||
\x{f0}
|
||||
@ -520,8 +424,6 @@ Subject length lower bound = 1
|
||||
0: \x{100}
|
||||
\x{100}Z
|
||||
0: \x{100}
|
||||
*** Failers
|
||||
No match
|
||||
|
||||
/[\xff]/IB,utf
|
||||
------------------------------------------------------------------
|
||||
@ -607,30 +509,38 @@ Subject length lower bound = 2
|
||||
Failed: error 106 at offset 13: missing terminating ] for character class
|
||||
|
||||
/X/utf
|
||||
XX\x{d800}
|
||||
Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
|
||||
XX\x{d800}\=no_utf_check
|
||||
0: X
|
||||
XX\x{da00}
|
||||
Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
|
||||
XX\x{da00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{dc00}
|
||||
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
|
||||
XX\x{dc00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{de00}
|
||||
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
|
||||
XX\x{de00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{dfff}
|
||||
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
|
||||
XX\x{dfff}\=no_utf_check
|
||||
0: X
|
||||
\= Expect UTF error
|
||||
XX\x{d800}
|
||||
Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
|
||||
XX\x{da00}
|
||||
Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
|
||||
XX\x{dc00}
|
||||
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
|
||||
XX\x{de00}
|
||||
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
|
||||
XX\x{dfff}
|
||||
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
|
||||
XX\x{110000}
|
||||
** Failed: character \x{110000} is greater than 0x10ffff and so cannot be converted to UTF-16
|
||||
XX\x{d800}\x{1234}
|
||||
Failed: error -25: UTF-16 error: invalid low surrogate at offset 3
|
||||
\= Expect no match
|
||||
XX\x{d800}\=offset=3
|
||||
No match
|
||||
|
||||
/(?<=.)X/utf
|
||||
XX\x{d800}\=offset=3
|
||||
Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
|
||||
|
||||
/(*UTF16)\x{11234}/
|
||||
abcd\x{11234}pqr
|
||||
@ -647,7 +557,7 @@ Subject length lower bound = 1
|
||||
0: \x{11234}
|
||||
|
||||
/(*UTF-32)\x{11234}/
|
||||
Failed: error 134 at offset 17: character code point value in \x{} or \o{} is too large
|
||||
Failed: error 160 at offset 5: (*VERB) not recognized or malformed
|
||||
abcd\x{11234}pqr
|
||||
|
||||
/(*UTF-32)\x{112}/
|
||||
@ -788,8 +698,10 @@ Subject length lower bound = 3
|
||||
0: aa
|
||||
a\x{123}aa\=offset=3
|
||||
0: a
|
||||
\= Expect no match
|
||||
a\x{123}aa\=offset=4
|
||||
No match
|
||||
\= Expect bad offset error
|
||||
a\x{123}aa\=offset=5
|
||||
Failed: error -33: bad offset value
|
||||
a\x{123}aa\=offset=6
|
||||
@ -854,16 +766,21 @@ Subject length lower bound = 1
|
||||
# Check bad offset
|
||||
|
||||
/a/utf
|
||||
\= Expect bad UTF-16 offset, or no match in 32-bit
|
||||
\x{10000}\=offset=1
|
||||
Error -36 (bad UTF-16 offset)
|
||||
\x{10000}ab\=offset=1
|
||||
Error -36 (bad UTF-16 offset)
|
||||
\= Expect 16-bit match, 32-bit no match
|
||||
\x{10000}ab\=offset=2
|
||||
0: a
|
||||
\= Expect no match
|
||||
\x{10000}ab\=offset=3
|
||||
No match
|
||||
\= Expect no match in 16-bit, bad offset in 32-bit
|
||||
\x{10000}ab\=offset=4
|
||||
No match
|
||||
\= Expect bad offset
|
||||
\x{10000}ab\=offset=5
|
||||
Failed: error -33: bad offset value
|
||||
|
||||
@ -1123,10 +1040,6 @@ Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too
|
||||
/\o{4200000}/utf
|
||||
Failed: error 134 at offset 10: character code point value in \x{} or \o{} is too large
|
||||
|
||||
/\C/utf
|
||||
\x{110000}
|
||||
** Failed: character \x{110000} is greater than 0x10ffff and so cannot be converted to UTF-16
|
||||
|
||||
/\x{100}*A/IB,utf
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
@ -1174,8 +1087,6 @@ Subject length lower bound = 1
|
||||
0: \x{100}
|
||||
\x{100}Z
|
||||
0: \x{100}
|
||||
*** Failers
|
||||
No match
|
||||
|
||||
/[z-\x{100}]/IB,utf
|
||||
------------------------------------------------------------------
|
||||
@ -1365,8 +1276,7 @@ Subject length lower bound = 1
|
||||
0: \x{105}
|
||||
\x{109}
|
||||
0: \x{109}
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
\x{100}
|
||||
No match
|
||||
\x{10a}
|
||||
@ -1410,8 +1320,7 @@ Subject length lower bound = 1
|
||||
0: \x{100}
|
||||
\x{101}
|
||||
0: \x{101}
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
\x{102}
|
||||
No match
|
||||
Y
|
||||
@ -1454,4 +1363,56 @@ Starting code units: \xff
|
||||
Last code unit = 'B' (caseless)
|
||||
Subject length lower bound = 2
|
||||
|
||||
/./utf
|
||||
\x{110000}
|
||||
** Failed: character \x{110000} is greater than 0x10ffff and so cannot be converted to UTF-16
|
||||
|
||||
/(*UTF)ab������z/B
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
ab\x{fd}\x{bf}\x{bf}\x{bf}\x{bf}\x{bf}z
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/ab������z/utf
|
||||
** Failed: character value greater than 0x10ffff cannot be converted to UTF
|
||||
|
||||
/[\W\p{Any}]/B
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[\x00-/:-@[-^`{-\xff\p{Any}\x{100}-\x{ffff}]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
abc
|
||||
0: a
|
||||
123
|
||||
0: 1
|
||||
|
||||
/[\W\pL]/B
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[\x00-/:-@[-^`{-\xff\p{L}\x{100}-\x{ffff}]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
abc
|
||||
0: a
|
||||
\x{100}
|
||||
0: \x{100}
|
||||
\x{308}
|
||||
0: \x{308}
|
||||
\= Expect no match
|
||||
123
|
||||
No match
|
||||
|
||||
/[\s[:^ascii:]]/B,ucp
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[\x80-\xff\p{Xsp}\x{100}-\x{ffff}]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
# End of testinput12
|
||||
|
197
pcre2/testdata/testoutput12-32
vendored
@ -9,76 +9,6 @@
|
||||
�]
|
||||
** Failed: invalid UTF-8 string cannot be used as input in UTF mode
|
||||
|
||||
/X(\C{3})/utf
|
||||
X\x{11234}Y
|
||||
No match
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}YZ
|
||||
1: \x{11234}YZ
|
||||
|
||||
/X(\C{4})/utf
|
||||
X\x{11234}YZ
|
||||
No match
|
||||
X\x{11234}YZW
|
||||
0: X\x{11234}YZW
|
||||
1: \x{11234}YZW
|
||||
|
||||
/X\C*/utf
|
||||
XYZabcdce
|
||||
0: XYZabcdce
|
||||
|
||||
/X\C*?/utf
|
||||
XYZabcde
|
||||
0: X
|
||||
|
||||
/X\C{3,5}/utf
|
||||
Xabcdefg
|
||||
0: Xabcde
|
||||
X\x{11234}Y
|
||||
No match
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}YZ
|
||||
X\x{11234}\x{512}
|
||||
No match
|
||||
X\x{11234}\x{512}YZ
|
||||
0: X\x{11234}\x{512}YZ
|
||||
X\x{11234}\x{512}\x{11234}Z
|
||||
0: X\x{11234}\x{512}\x{11234}Z
|
||||
|
||||
/X\C{3,5}?/utf
|
||||
Xabcdefg
|
||||
0: Xabc
|
||||
X\x{11234}Y
|
||||
No match
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}YZ
|
||||
X\x{11234}\x{512}YZ
|
||||
0: X\x{11234}\x{512}Y
|
||||
*** Failers
|
||||
No match
|
||||
X\x{11234}
|
||||
No match
|
||||
|
||||
/a\Cb/utf
|
||||
aXb
|
||||
0: aXb
|
||||
a\nb
|
||||
0: a\x{0a}b
|
||||
|
||||
/a\C\Cb/utf
|
||||
a\x{12257}b
|
||||
No match
|
||||
a\x{12257}\x{11234}b
|
||||
0: a\x{12257}\x{11234}b
|
||||
** Failers
|
||||
No match
|
||||
a\x{100}b
|
||||
No match
|
||||
|
||||
/ab\Cde/utf
|
||||
abXde
|
||||
0: abXde
|
||||
|
||||
# Check maximum character size
|
||||
|
||||
/\x{ffff}/IB,utf
|
||||
@ -303,29 +233,6 @@ Options: utf
|
||||
First code unit = \x{212ab}
|
||||
Subject length lower bound = 1
|
||||
|
||||
# This one is here not because it's different to Perl, but because the way
|
||||
# the captured single-byte is displayed. (In Perl it becomes a character, and you
|
||||
# can't tell the difference.)
|
||||
|
||||
/X(\C)(.*)/utf
|
||||
X\x{1234}
|
||||
0: X\x{1234}
|
||||
1: \x{1234}
|
||||
2:
|
||||
X\nabc
|
||||
0: X\x{0a}abc
|
||||
1: \x{0a}
|
||||
2: abc
|
||||
|
||||
# This one is here because Perl gives out a grumbly error message (quite
|
||||
# correctly, but that messes up comparisons).
|
||||
|
||||
/a\Cb/utf
|
||||
*** Failers
|
||||
No match
|
||||
a\x{100}b
|
||||
0: a\x{100}b
|
||||
|
||||
/[^ab\xC0-\xF0]/IB,utf
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
@ -355,8 +262,7 @@ Subject length lower bound = 1
|
||||
0: \x{100}
|
||||
\x{1000}
|
||||
0: \x{1000}
|
||||
*** Failers
|
||||
0: *
|
||||
\= Expect no match
|
||||
\x{c0}
|
||||
No match
|
||||
\x{f0}
|
||||
@ -513,8 +419,6 @@ Subject length lower bound = 1
|
||||
0: \x{100}
|
||||
\x{100}Z
|
||||
0: \x{100}
|
||||
*** Failers
|
||||
No match
|
||||
|
||||
/[\xff]/IB,utf
|
||||
------------------------------------------------------------------
|
||||
@ -600,30 +504,38 @@ Subject length lower bound = 2
|
||||
Failed: error 106 at offset 13: missing terminating ] for character class
|
||||
|
||||
/X/utf
|
||||
XX\x{d800}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{d800}\=no_utf_check
|
||||
0: X
|
||||
XX\x{da00}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{da00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{dc00}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{dc00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{de00}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{de00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{dfff}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{dfff}\=no_utf_check
|
||||
0: X
|
||||
\= Expect UTF error
|
||||
XX\x{d800}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{da00}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{dc00}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{de00}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{dfff}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{110000}
|
||||
Failed: error -28: UTF-32 error: code points greater than 0x10ffff are not defined at offset 2
|
||||
XX\x{d800}\x{1234}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
\= Expect no match
|
||||
XX\x{d800}\=offset=3
|
||||
No match
|
||||
|
||||
/(?<=.)X/utf
|
||||
XX\x{d800}\=offset=3
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
|
||||
/(*UTF16)\x{11234}/
|
||||
Failed: error 160 at offset 5: (*VERB) not recognized or malformed
|
||||
@ -780,8 +692,10 @@ Subject length lower bound = 3
|
||||
0: aa
|
||||
a\x{123}aa\=offset=3
|
||||
0: a
|
||||
\= Expect no match
|
||||
a\x{123}aa\=offset=4
|
||||
No match
|
||||
\= Expect bad offset error
|
||||
a\x{123}aa\=offset=5
|
||||
Failed: error -33: bad offset value
|
||||
a\x{123}aa\=offset=6
|
||||
@ -846,16 +760,21 @@ Subject length lower bound = 1
|
||||
# Check bad offset
|
||||
|
||||
/a/utf
|
||||
\= Expect bad UTF-16 offset, or no match in 32-bit
|
||||
\x{10000}\=offset=1
|
||||
No match
|
||||
\x{10000}ab\=offset=1
|
||||
0: a
|
||||
\= Expect 16-bit match, 32-bit no match
|
||||
\x{10000}ab\=offset=2
|
||||
No match
|
||||
\= Expect no match
|
||||
\x{10000}ab\=offset=3
|
||||
No match
|
||||
\= Expect no match in 16-bit, bad offset in 32-bit
|
||||
\x{10000}ab\=offset=4
|
||||
Failed: error -33: bad offset value
|
||||
\= Expect bad offset
|
||||
\x{10000}ab\=offset=5
|
||||
Failed: error -33: bad offset value
|
||||
|
||||
@ -1115,10 +1034,6 @@ Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too
|
||||
/\o{4200000}/utf
|
||||
Failed: error 134 at offset 10: character code point value in \x{} or \o{} is too large
|
||||
|
||||
/\C/utf
|
||||
\x{110000}
|
||||
Failed: error -28: UTF-32 error: code points greater than 0x10ffff are not defined at offset 0
|
||||
|
||||
/\x{100}*A/IB,utf
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
@ -1166,8 +1081,6 @@ Subject length lower bound = 1
|
||||
0: \x{100}
|
||||
\x{100}Z
|
||||
0: \x{100}
|
||||
*** Failers
|
||||
No match
|
||||
|
||||
/[z-\x{100}]/IB,utf
|
||||
------------------------------------------------------------------
|
||||
@ -1357,8 +1270,7 @@ Subject length lower bound = 1
|
||||
0: \x{105}
|
||||
\x{109}
|
||||
0: \x{109}
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
\x{100}
|
||||
No match
|
||||
\x{10a}
|
||||
@ -1402,8 +1314,7 @@ Subject length lower bound = 1
|
||||
0: \x{100}
|
||||
\x{101}
|
||||
0: \x{101}
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
\x{102}
|
||||
No match
|
||||
Y
|
||||
@ -1446,4 +1357,56 @@ Starting code units: \xff
|
||||
Last code unit = 'B' (caseless)
|
||||
Subject length lower bound = 2
|
||||
|
||||
/./utf
|
||||
\x{110000}
|
||||
Failed: error -28: UTF-32 error: code points greater than 0x10ffff are not defined at offset 0
|
||||
|
||||
/(*UTF)ab������z/B
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
ab\x{fd}\x{bf}\x{bf}\x{bf}\x{bf}\x{bf}z
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/ab������z/utf
|
||||
** Failed: character value greater than 0x10ffff cannot be converted to UTF
|
||||
|
||||
/[\W\p{Any}]/B
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[\x00-/:-@[-^`{-\xff\p{Any}\x{100}-\x{ffffffff}]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
abc
|
||||
0: a
|
||||
123
|
||||
0: 1
|
||||
|
||||
/[\W\pL]/B
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[\x00-/:-@[-^`{-\xff\p{L}\x{100}-\x{ffffffff}]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
abc
|
||||
0: a
|
||||
\x{100}
|
||||
0: \x{100}
|
||||
\x{308}
|
||||
0: \x{308}
|
||||
\= Expect no match
|
||||
123
|
||||
No match
|
||||
|
||||
/[\s[:^ascii:]]/B,ucp
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[\x80-\xff\p{Xsp}\x{100}-\x{ffffffff}]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
# End of testinput12
|
||||
|
242
pcre2/testdata/testoutput14
vendored
@ -1,242 +0,0 @@
|
||||
# These are:
|
||||
#
|
||||
# (1) Tests of the match-limiting features. The results are different for
|
||||
# interpretive or JIT matching, so this test should not be run with JIT. The
|
||||
# same tests are run using JIT in test 16.
|
||||
|
||||
# (2) Other tests that must not be run with JIT.
|
||||
|
||||
/(a+)*zz/I
|
||||
Capturing subpattern count = 1
|
||||
Starting code units: a z
|
||||
Last code unit = 'z'
|
||||
Subject length lower bound = 2
|
||||
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzbbbbbb\=find_limits
|
||||
Minimum match limit = 8
|
||||
Minimum recursion limit = 6
|
||||
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazz
|
||||
1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
|
||||
aaaaaaaaaaaaaz\=find_limits
|
||||
Minimum match limit = 32768
|
||||
Minimum recursion limit = 29
|
||||
No match
|
||||
|
||||
!((?:\s|//.*\\n|/[*](?:\\n|.)*?[*]/)*)!I
|
||||
Capturing subpattern count = 1
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
/* this is a C style comment */\=find_limits
|
||||
Minimum match limit = 120
|
||||
Minimum recursion limit = 6
|
||||
0: /* this is a C style comment */
|
||||
1: /* this is a C style comment */
|
||||
|
||||
/^(?>a)++/
|
||||
aa\=find_limits
|
||||
Minimum match limit = 5
|
||||
Minimum recursion limit = 2
|
||||
0: aa
|
||||
aaaaaaaaa\=find_limits
|
||||
Minimum match limit = 12
|
||||
Minimum recursion limit = 2
|
||||
0: aaaaaaaaa
|
||||
|
||||
/(a)(?1)++/
|
||||
aa\=find_limits
|
||||
Minimum match limit = 7
|
||||
Minimum recursion limit = 4
|
||||
0: aa
|
||||
1: a
|
||||
aaaaaaaaa\=find_limits
|
||||
Minimum match limit = 21
|
||||
Minimum recursion limit = 4
|
||||
0: aaaaaaaaa
|
||||
1: a
|
||||
|
||||
/a(?:.)*?a/ims
|
||||
abbbbbbbbbbbbbbbbbbbbba\=find_limits
|
||||
Minimum match limit = 65
|
||||
Minimum recursion limit = 2
|
||||
0: abbbbbbbbbbbbbbbbbbbbba
|
||||
|
||||
/a(?:.(*THEN))*?a/ims
|
||||
abbbbbbbbbbbbbbbbbbbbba\=find_limits
|
||||
Minimum match limit = 86
|
||||
Minimum recursion limit = 45
|
||||
0: abbbbbbbbbbbbbbbbbbbbba
|
||||
|
||||
/a(?:.(*THEN:ABC))*?a/ims
|
||||
abbbbbbbbbbbbbbbbbbbbba\=find_limits
|
||||
Minimum match limit = 86
|
||||
Minimum recursion limit = 45
|
||||
0: abbbbbbbbbbbbbbbbbbbbba
|
||||
|
||||
/^(?>a+)(?>b+)(?>c+)(?>d+)(?>e+)/
|
||||
aabbccddee\=find_limits
|
||||
Minimum match limit = 7
|
||||
Minimum recursion limit = 2
|
||||
0: aabbccddee
|
||||
|
||||
/^(?>(a+))(?>(b+))(?>(c+))(?>(d+))(?>(e+))/
|
||||
aabbccddee\=find_limits
|
||||
Minimum match limit = 17
|
||||
Minimum recursion limit = 16
|
||||
0: aabbccddee
|
||||
1: aa
|
||||
2: bb
|
||||
3: cc
|
||||
4: dd
|
||||
5: ee
|
||||
|
||||
/^(?>(a+))(?>b+)(?>(c+))(?>d+)(?>(e+))/
|
||||
aabbccddee\=find_limits
|
||||
Minimum match limit = 13
|
||||
Minimum recursion limit = 10
|
||||
0: aabbccddee
|
||||
1: aa
|
||||
2: cc
|
||||
3: ee
|
||||
|
||||
/(*LIMIT_MATCH=12bc)abc/
|
||||
Failed: error 160 at offset 0: (*VERB) not recognized or malformed
|
||||
|
||||
/(*LIMIT_MATCH=4294967290)abc/
|
||||
Failed: error 160 at offset 0: (*VERB) not recognized or malformed
|
||||
|
||||
/(*LIMIT_RECURSION=4294967280)abc/I
|
||||
Capturing subpattern count = 0
|
||||
Recursion limit = 4294967280
|
||||
First code unit = 'a'
|
||||
Last code unit = 'c'
|
||||
Subject length lower bound = 3
|
||||
|
||||
/(a+)*zz/
|
||||
aaaaaaaaaaaaaz
|
||||
No match
|
||||
aaaaaaaaaaaaaz\=match_limit=3000
|
||||
Failed: error -47: match limit exceeded
|
||||
|
||||
/(a+)*zz/
|
||||
aaaaaaaaaaaaaz\=recursion_limit=10
|
||||
Failed: error -53: recursion limit exceeded
|
||||
|
||||
/(*LIMIT_MATCH=3000)(a+)*zz/I
|
||||
Capturing subpattern count = 1
|
||||
Match limit = 3000
|
||||
Starting code units: a z
|
||||
Last code unit = 'z'
|
||||
Subject length lower bound = 2
|
||||
aaaaaaaaaaaaaz
|
||||
Failed: error -47: match limit exceeded
|
||||
aaaaaaaaaaaaaz\=match_limit=60000
|
||||
Failed: error -47: match limit exceeded
|
||||
|
||||
/(*LIMIT_MATCH=60000)(*LIMIT_MATCH=3000)(a+)*zz/I
|
||||
Capturing subpattern count = 1
|
||||
Match limit = 3000
|
||||
Starting code units: a z
|
||||
Last code unit = 'z'
|
||||
Subject length lower bound = 2
|
||||
aaaaaaaaaaaaaz
|
||||
Failed: error -47: match limit exceeded
|
||||
|
||||
/(*LIMIT_MATCH=60000)(a+)*zz/I
|
||||
Capturing subpattern count = 1
|
||||
Match limit = 60000
|
||||
Starting code units: a z
|
||||
Last code unit = 'z'
|
||||
Subject length lower bound = 2
|
||||
aaaaaaaaaaaaaz
|
||||
No match
|
||||
aaaaaaaaaaaaaz\=match_limit=3000
|
||||
Failed: error -47: match limit exceeded
|
||||
|
||||
/(*LIMIT_RECURSION=10)(a+)*zz/I
|
||||
Capturing subpattern count = 1
|
||||
Recursion limit = 10
|
||||
Starting code units: a z
|
||||
Last code unit = 'z'
|
||||
Subject length lower bound = 2
|
||||
aaaaaaaaaaaaaz
|
||||
Failed: error -53: recursion limit exceeded
|
||||
aaaaaaaaaaaaaz\=recursion_limit=1000
|
||||
Failed: error -53: recursion limit exceeded
|
||||
|
||||
/(*LIMIT_RECURSION=10)(*LIMIT_RECURSION=1000)(a+)*zz/I
|
||||
Capturing subpattern count = 1
|
||||
Recursion limit = 1000
|
||||
Starting code units: a z
|
||||
Last code unit = 'z'
|
||||
Subject length lower bound = 2
|
||||
aaaaaaaaaaaaaz
|
||||
No match
|
||||
|
||||
/(*LIMIT_RECURSION=1000)(a+)*zz/I
|
||||
Capturing subpattern count = 1
|
||||
Recursion limit = 1000
|
||||
Starting code units: a z
|
||||
Last code unit = 'z'
|
||||
Subject length lower bound = 2
|
||||
aaaaaaaaaaaaaz
|
||||
No match
|
||||
aaaaaaaaaaaaaz\=recursion_limit=10
|
||||
Failed: error -53: recursion limit exceeded
|
||||
|
||||
# These three have infinitely nested recursions.
|
||||
|
||||
/((?2))((?1))/
|
||||
abc
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
/((?(R2)a+|(?1)b))/
|
||||
aaaabcde
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
/(?(R)a*(?1)|((?R))b)/
|
||||
aaaabcde
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
# The allusedtext modifier does not work with JIT, which does not maintain
|
||||
# the leftchar/rightchar data.
|
||||
|
||||
/abc(?=xyz)/allusedtext
|
||||
abcxyzpqr
|
||||
0: abcxyz
|
||||
>>>
|
||||
abcxyzpqr\=aftertext
|
||||
0: abcxyz
|
||||
>>>
|
||||
0+ xyzpqr
|
||||
|
||||
/(?<=pqr)abc(?=xyz)/allusedtext
|
||||
xyzpqrabcxyzpqr
|
||||
0: pqrabcxyz
|
||||
<<< >>>
|
||||
xyzpqrabcxyzpqr\=aftertext
|
||||
0: pqrabcxyz
|
||||
<<< >>>
|
||||
0+ xyzpqr
|
||||
|
||||
/a\b/
|
||||
a.\=allusedtext
|
||||
0: a.
|
||||
>
|
||||
a\=allusedtext
|
||||
0: a
|
||||
|
||||
/abc\Kxyz/
|
||||
abcxyz\=allusedtext
|
||||
0: abcxyz
|
||||
<<<
|
||||
|
||||
/abc(?=xyz(*ACCEPT))/
|
||||
abcxyz\=allusedtext
|
||||
0: abcxyz
|
||||
>>>
|
||||
|
||||
/abc(?=abcde)(?=ab)/allusedtext
|
||||
abcabcdefg
|
||||
0: abcabcde
|
||||
>>>>>
|
||||
|
||||
# End of testinput14
|
61
pcre2/testdata/testoutput14-16
vendored
Normal file
@ -0,0 +1,61 @@
|
||||
# These test special (mostly error) UTF features of DFA matching. They are a
|
||||
# selection of the more comprehensive tests that are run for non-DFA matching.
|
||||
# The output is different for the different widths.
|
||||
|
||||
#subject dfa
|
||||
|
||||
/X/utf
|
||||
XX\x{d800}
|
||||
Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
|
||||
XX\x{d800}\=offset=3
|
||||
No match
|
||||
XX\x{d800}\=no_utf_check
|
||||
0: X
|
||||
XX\x{da00}
|
||||
Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
|
||||
XX\x{da00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{dc00}
|
||||
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
|
||||
XX\x{dc00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{de00}
|
||||
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
|
||||
XX\x{de00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{dfff}
|
||||
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
|
||||
XX\x{dfff}\=no_utf_check
|
||||
0: X
|
||||
XX\x{110000}
|
||||
** Failed: character \x{110000} is greater than 0x10ffff and so cannot be converted to UTF-16
|
||||
XX\x{d800}\x{1234}
|
||||
Failed: error -25: UTF-16 error: invalid low surrogate at offset 3
|
||||
|
||||
/badutf/utf
|
||||
X\xdf
|
||||
No match
|
||||
XX\xef
|
||||
No match
|
||||
XXX\xef\x80
|
||||
No match
|
||||
X\xf7
|
||||
No match
|
||||
XX\xf7\x80
|
||||
No match
|
||||
XXX\xf7\x80\x80
|
||||
No match
|
||||
|
||||
/shortutf/utf
|
||||
XX\xdf\=ph
|
||||
No match
|
||||
XX\xef\=ph
|
||||
No match
|
||||
XX\xef\x80\=ph
|
||||
No match
|
||||
\xf7\=ph
|
||||
No match
|
||||
\xf7\x80\=ph
|
||||
No match
|
||||
|
||||
# End of testinput14
|
61
pcre2/testdata/testoutput14-32
vendored
Normal file
@ -0,0 +1,61 @@
|
||||
# These test special (mostly error) UTF features of DFA matching. They are a
|
||||
# selection of the more comprehensive tests that are run for non-DFA matching.
|
||||
# The output is different for the different widths.
|
||||
|
||||
#subject dfa
|
||||
|
||||
/X/utf
|
||||
XX\x{d800}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{d800}\=offset=3
|
||||
No match
|
||||
XX\x{d800}\=no_utf_check
|
||||
0: X
|
||||
XX\x{da00}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{da00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{dc00}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{dc00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{de00}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{de00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{dfff}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{dfff}\=no_utf_check
|
||||
0: X
|
||||
XX\x{110000}
|
||||
Failed: error -28: UTF-32 error: code points greater than 0x10ffff are not defined at offset 2
|
||||
XX\x{d800}\x{1234}
|
||||
Failed: error -27: UTF-32 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
|
||||
/badutf/utf
|
||||
X\xdf
|
||||
No match
|
||||
XX\xef
|
||||
No match
|
||||
XXX\xef\x80
|
||||
No match
|
||||
X\xf7
|
||||
No match
|
||||
XX\xf7\x80
|
||||
No match
|
||||
XXX\xf7\x80\x80
|
||||
No match
|
||||
|
||||
/shortutf/utf
|
||||
XX\xdf\=ph
|
||||
No match
|
||||
XX\xef\=ph
|
||||
No match
|
||||
XX\xef\x80\=ph
|
||||
No match
|
||||
\xf7\=ph
|
||||
No match
|
||||
\xf7\x80\=ph
|
||||
No match
|
||||
|
||||
# End of testinput14
|
61
pcre2/testdata/testoutput14-8
vendored
Normal file
@ -0,0 +1,61 @@
|
||||
# These test special (mostly error) UTF features of DFA matching. They are a
|
||||
# selection of the more comprehensive tests that are run for non-DFA matching.
|
||||
# The output is different for the different widths.
|
||||
|
||||
#subject dfa
|
||||
|
||||
/X/utf
|
||||
XX\x{d800}
|
||||
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{d800}\=offset=3
|
||||
Error -36 (bad UTF-8 offset)
|
||||
XX\x{d800}\=no_utf_check
|
||||
0: X
|
||||
XX\x{da00}
|
||||
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{da00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{dc00}
|
||||
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{dc00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{de00}
|
||||
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{de00}\=no_utf_check
|
||||
0: X
|
||||
XX\x{dfff}
|
||||
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
XX\x{dfff}\=no_utf_check
|
||||
0: X
|
||||
XX\x{110000}
|
||||
Failed: error -15: UTF-8 error: code points greater than 0x10ffff are not defined at offset 2
|
||||
XX\x{d800}\x{1234}
|
||||
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
|
||||
|
||||
/badutf/utf
|
||||
X\xdf
|
||||
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 1
|
||||
XX\xef
|
||||
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 2
|
||||
XXX\xef\x80
|
||||
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 3
|
||||
X\xf7
|
||||
Failed: error -5: UTF-8 error: 3 bytes missing at end at offset 1
|
||||
XX\xf7\x80
|
||||
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 2
|
||||
XXX\xf7\x80\x80
|
||||
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 3
|
||||
|
||||
/shortutf/utf
|
||||
XX\xdf\=ph
|
||||
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 2
|
||||
XX\xef\=ph
|
||||
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 2
|
||||
XX\xef\x80\=ph
|
||||
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 2
|
||||
\xf7\=ph
|
||||
Failed: error -5: UTF-8 error: 3 bytes missing at end at offset 0
|
||||
\xf7\x80\=ph
|
||||
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 0
|
||||
|
||||
# End of testinput14
|
385
pcre2/testdata/testoutput15
vendored
@ -1,17 +1,390 @@
|
||||
# This test is run only when JIT support is not available. It checks that an
|
||||
# attempt to use it has the expected behaviour. It also tests things that
|
||||
# are different without JIT.
|
||||
# These are:
|
||||
#
|
||||
# (1) Tests of the match-limiting features. The results are different for
|
||||
# interpretive or JIT matching, so this test should not be run with JIT. The
|
||||
# same tests are run using JIT in test 17.
|
||||
|
||||
/abc/I,jit,jitverify
|
||||
# (2) Other tests that must not be run with JIT.
|
||||
|
||||
/(a+)*zz/I
|
||||
Capturing subpattern count = 1
|
||||
Starting code units: a z
|
||||
Last code unit = 'z'
|
||||
Subject length lower bound = 2
|
||||
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzbbbbbb\=find_limits
|
||||
Minimum match limit = 8
|
||||
Minimum recursion limit = 6
|
||||
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazz
|
||||
1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
|
||||
aaaaaaaaaaaaaz\=find_limits
|
||||
Minimum match limit = 32768
|
||||
Minimum recursion limit = 29
|
||||
No match
|
||||
|
||||
!((?:\s|//.*\\n|/[*](?:\\n|.)*?[*]/)*)!I
|
||||
Capturing subpattern count = 1
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
/* this is a C style comment */\=find_limits
|
||||
Minimum match limit = 120
|
||||
Minimum recursion limit = 6
|
||||
0: /* this is a C style comment */
|
||||
1: /* this is a C style comment */
|
||||
|
||||
/^(?>a)++/
|
||||
aa\=find_limits
|
||||
Minimum match limit = 5
|
||||
Minimum recursion limit = 2
|
||||
0: aa
|
||||
aaaaaaaaa\=find_limits
|
||||
Minimum match limit = 12
|
||||
Minimum recursion limit = 2
|
||||
0: aaaaaaaaa
|
||||
|
||||
/(a)(?1)++/
|
||||
aa\=find_limits
|
||||
Minimum match limit = 7
|
||||
Minimum recursion limit = 4
|
||||
0: aa
|
||||
1: a
|
||||
aaaaaaaaa\=find_limits
|
||||
Minimum match limit = 21
|
||||
Minimum recursion limit = 4
|
||||
0: aaaaaaaaa
|
||||
1: a
|
||||
|
||||
/a(?:.)*?a/ims
|
||||
abbbbbbbbbbbbbbbbbbbbba\=find_limits
|
||||
Minimum match limit = 65
|
||||
Minimum recursion limit = 2
|
||||
0: abbbbbbbbbbbbbbbbbbbbba
|
||||
|
||||
/a(?:.(*THEN))*?a/ims
|
||||
abbbbbbbbbbbbbbbbbbbbba\=find_limits
|
||||
Minimum match limit = 86
|
||||
Minimum recursion limit = 45
|
||||
0: abbbbbbbbbbbbbbbbbbbbba
|
||||
|
||||
/a(?:.(*THEN:ABC))*?a/ims
|
||||
abbbbbbbbbbbbbbbbbbbbba\=find_limits
|
||||
Minimum match limit = 86
|
||||
Minimum recursion limit = 45
|
||||
0: abbbbbbbbbbbbbbbbbbbbba
|
||||
|
||||
/^(?>a+)(?>b+)(?>c+)(?>d+)(?>e+)/
|
||||
aabbccddee\=find_limits
|
||||
Minimum match limit = 7
|
||||
Minimum recursion limit = 2
|
||||
0: aabbccddee
|
||||
|
||||
/^(?>(a+))(?>(b+))(?>(c+))(?>(d+))(?>(e+))/
|
||||
aabbccddee\=find_limits
|
||||
Minimum match limit = 17
|
||||
Minimum recursion limit = 16
|
||||
0: aabbccddee
|
||||
1: aa
|
||||
2: bb
|
||||
3: cc
|
||||
4: dd
|
||||
5: ee
|
||||
|
||||
/^(?>(a+))(?>b+)(?>(c+))(?>d+)(?>(e+))/
|
||||
aabbccddee\=find_limits
|
||||
Minimum match limit = 13
|
||||
Minimum recursion limit = 10
|
||||
0: aabbccddee
|
||||
1: aa
|
||||
2: cc
|
||||
3: ee
|
||||
|
||||
/(*LIMIT_MATCH=12bc)abc/
|
||||
Failed: error 160 at offset 17: (*VERB) not recognized or malformed
|
||||
|
||||
/(*LIMIT_MATCH=4294967290)abc/
|
||||
Failed: error 160 at offset 24: (*VERB) not recognized or malformed
|
||||
|
||||
/(*LIMIT_RECURSION=4294967280)abc/I
|
||||
Capturing subpattern count = 0
|
||||
Recursion limit = 4294967280
|
||||
First code unit = 'a'
|
||||
Last code unit = 'c'
|
||||
Subject length lower bound = 3
|
||||
JIT support is not available in this version of PCRE2
|
||||
|
||||
/a*/I
|
||||
/(a+)*zz/
|
||||
aaaaaaaaaaaaaz
|
||||
No match
|
||||
aaaaaaaaaaaaaz\=match_limit=3000
|
||||
Failed: error -47: match limit exceeded
|
||||
|
||||
/(a+)*zz/
|
||||
aaaaaaaaaaaaaz\=recursion_limit=10
|
||||
Failed: error -53: recursion limit exceeded
|
||||
|
||||
/(*LIMIT_MATCH=3000)(a+)*zz/I
|
||||
Capturing subpattern count = 1
|
||||
Match limit = 3000
|
||||
Starting code units: a z
|
||||
Last code unit = 'z'
|
||||
Subject length lower bound = 2
|
||||
aaaaaaaaaaaaaz
|
||||
Failed: error -47: match limit exceeded
|
||||
aaaaaaaaaaaaaz\=match_limit=60000
|
||||
Failed: error -47: match limit exceeded
|
||||
|
||||
/(*LIMIT_MATCH=60000)(*LIMIT_MATCH=3000)(a+)*zz/I
|
||||
Capturing subpattern count = 1
|
||||
Match limit = 3000
|
||||
Starting code units: a z
|
||||
Last code unit = 'z'
|
||||
Subject length lower bound = 2
|
||||
aaaaaaaaaaaaaz
|
||||
Failed: error -47: match limit exceeded
|
||||
|
||||
/(*LIMIT_MATCH=60000)(a+)*zz/I
|
||||
Capturing subpattern count = 1
|
||||
Match limit = 60000
|
||||
Starting code units: a z
|
||||
Last code unit = 'z'
|
||||
Subject length lower bound = 2
|
||||
aaaaaaaaaaaaaz
|
||||
No match
|
||||
aaaaaaaaaaaaaz\=match_limit=3000
|
||||
Failed: error -47: match limit exceeded
|
||||
|
||||
/(*LIMIT_RECURSION=10)(a+)*zz/I
|
||||
Capturing subpattern count = 1
|
||||
Recursion limit = 10
|
||||
Starting code units: a z
|
||||
Last code unit = 'z'
|
||||
Subject length lower bound = 2
|
||||
aaaaaaaaaaaaaz
|
||||
Failed: error -53: recursion limit exceeded
|
||||
aaaaaaaaaaaaaz\=recursion_limit=1000
|
||||
Failed: error -53: recursion limit exceeded
|
||||
|
||||
/(*LIMIT_RECURSION=10)(*LIMIT_RECURSION=1000)(a+)*zz/I
|
||||
Capturing subpattern count = 1
|
||||
Recursion limit = 1000
|
||||
Starting code units: a z
|
||||
Last code unit = 'z'
|
||||
Subject length lower bound = 2
|
||||
aaaaaaaaaaaaaz
|
||||
No match
|
||||
|
||||
/(*LIMIT_RECURSION=1000)(a+)*zz/I
|
||||
Capturing subpattern count = 1
|
||||
Recursion limit = 1000
|
||||
Starting code units: a z
|
||||
Last code unit = 'z'
|
||||
Subject length lower bound = 2
|
||||
aaaaaaaaaaaaaz
|
||||
No match
|
||||
aaaaaaaaaaaaaz\=recursion_limit=10
|
||||
Failed: error -53: recursion limit exceeded
|
||||
|
||||
# These three have infinitely nested recursions.
|
||||
|
||||
/((?2))((?1))/
|
||||
abc
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
/((?(R2)a+|(?1)b))()/
|
||||
aaaabcde
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
/(?(R)a*(?1)|((?R))b)/
|
||||
aaaabcde
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
# The allusedtext modifier does not work with JIT, which does not maintain
|
||||
# the leftchar/rightchar data.
|
||||
|
||||
/abc(?=xyz)/allusedtext
|
||||
abcxyzpqr
|
||||
0: abcxyz
|
||||
>>>
|
||||
abcxyzpqr\=aftertext
|
||||
0: abcxyz
|
||||
>>>
|
||||
0+ xyzpqr
|
||||
|
||||
/(?<=pqr)abc(?=xyz)/allusedtext
|
||||
xyzpqrabcxyzpqr
|
||||
0: pqrabcxyz
|
||||
<<< >>>
|
||||
xyzpqrabcxyzpqr\=aftertext
|
||||
0: pqrabcxyz
|
||||
<<< >>>
|
||||
0+ xyzpqr
|
||||
|
||||
/a\b/
|
||||
a.\=allusedtext
|
||||
0: a.
|
||||
>
|
||||
a\=allusedtext
|
||||
0: a
|
||||
|
||||
/abc\Kxyz/
|
||||
abcxyz\=allusedtext
|
||||
0: abcxyz
|
||||
<<<
|
||||
|
||||
/abc(?=xyz(*ACCEPT))/
|
||||
abcxyz\=allusedtext
|
||||
0: abcxyz
|
||||
>>>
|
||||
|
||||
/abc(?=abcde)(?=ab)/allusedtext
|
||||
abcabcdefg
|
||||
0: abcabcde
|
||||
>>>>>
|
||||
|
||||
# These tests provoke recursion loops, which give a different error message
|
||||
# when JIT is used.
|
||||
|
||||
/(?R)/I
|
||||
Capturing subpattern count = 0
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
abcd
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
/(a|(?R))/I
|
||||
Capturing subpattern count = 1
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
abcd
|
||||
0: a
|
||||
1: a
|
||||
defg
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
/(ab|(bc|(de|(?R))))/I
|
||||
Capturing subpattern count = 3
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
abcd
|
||||
0: ab
|
||||
1: ab
|
||||
fghi
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
/(ab|(bc|(de|(?1))))/I
|
||||
Capturing subpattern count = 3
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
abcd
|
||||
0: ab
|
||||
1: ab
|
||||
fghi
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
/x(ab|(bc|(de|(?1)x)x)x)/I
|
||||
Capturing subpattern count = 3
|
||||
First code unit = 'x'
|
||||
Subject length lower bound = 3
|
||||
xab123
|
||||
0: xab
|
||||
1: ab
|
||||
xfghi
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
/(?!\w)(?R)/
|
||||
abcd
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
=abc
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
/(?=\w)(?R)/
|
||||
=abc
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
abcd
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
/(?<!\w)(?R)/
|
||||
abcd
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
/(?<=\w)(?R)/
|
||||
abcd
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
/(a+|(?R)b)/
|
||||
aaa
|
||||
0: aaa
|
||||
1: aaa
|
||||
bbb
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
/[^\xff]((?1))/BI
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[^\x{ff}]
|
||||
CBra 1
|
||||
Recurse
|
||||
Ket
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 1
|
||||
Subject length lower bound = 1
|
||||
abcd
|
||||
Failed: error -52: nested recursion at the same subject position
|
||||
|
||||
# These tests don't behave the same with JIT
|
||||
|
||||
/\w+(?C1)/BI,no_auto_possess
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
\w+
|
||||
Callout 1 8 0
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 0
|
||||
Options: no_auto_possess
|
||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
||||
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
||||
Subject length lower bound = 1
|
||||
abc\=callout_fail=1
|
||||
--->abc
|
||||
1 ^ ^
|
||||
1 ^ ^
|
||||
1 ^^
|
||||
1 ^ ^
|
||||
1 ^^
|
||||
1 ^^
|
||||
No match
|
||||
|
||||
/(*NO_AUTO_POSSESS)\w+(?C1)/BI
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
\w+
|
||||
Callout 1 26 0
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 0
|
||||
Compile options: <none>
|
||||
Overall options: no_auto_possess
|
||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
||||
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
||||
Subject length lower bound = 1
|
||||
abc\=callout_fail=1
|
||||
--->abc
|
||||
1 ^ ^
|
||||
1 ^ ^
|
||||
1 ^^
|
||||
1 ^ ^
|
||||
1 ^^
|
||||
1 ^^
|
||||
No match
|
||||
|
||||
# This test breaks the JIT stack limit
|
||||
|
||||
/(|]+){2,2452}/
|
||||
(|]+){2,2452}
|
||||
0:
|
||||
1:
|
||||
|
||||
# End of testinput15
|
||||
|
387
pcre2/testdata/testoutput16
vendored
File diff suppressed because one or more lines are too long
648
pcre2/testdata/testoutput17
vendored
File diff suppressed because one or more lines are too long
177
pcre2/testdata/testoutput18
vendored
@ -1,20 +1,171 @@
|
||||
# This set of tests is run only with the 8-bit library. It tests the POSIX
|
||||
# interface with UTF/UCP support, which is supported only with the 8-bit
|
||||
# library. This test should not be run with JIT (which is not available for the
|
||||
# POSIX interface).
|
||||
# interface, which is supported only with the 8-bit library. This test should
|
||||
# not be run with JIT (which is not available for the POSIX interface).
|
||||
|
||||
#forbid_utf
|
||||
#pattern posix
|
||||
|
||||
/a\x{1234}b/utf
|
||||
a\x{1234}b
|
||||
0: a\x{1234}b
|
||||
# Test invalid options
|
||||
|
||||
/\w/
|
||||
+++\x{c2}
|
||||
/abc/auto_callout
|
||||
** Ignored with POSIX interface: auto_callout
|
||||
|
||||
/abc/
|
||||
abc\=find_limits
|
||||
** Ignored with POSIX interface: find_limits
|
||||
0: abc
|
||||
|
||||
/abc/
|
||||
abc\=partial_hard
|
||||
** Ignored with POSIX interface: partial_hard
|
||||
0: abc
|
||||
|
||||
# Real tests
|
||||
|
||||
/abc/
|
||||
abc
|
||||
0: abc
|
||||
|
||||
/^abc|def/
|
||||
abcdef
|
||||
0: abc
|
||||
abcdef\=notbol
|
||||
0: def
|
||||
|
||||
/.*((abc)$|(def))/
|
||||
defabc
|
||||
0: defabc
|
||||
1: abc
|
||||
2: abc
|
||||
defabc\=noteol
|
||||
0: def
|
||||
1: def
|
||||
3: def
|
||||
|
||||
/the quick brown fox/
|
||||
the quick brown fox
|
||||
0: the quick brown fox
|
||||
\= Expect no match
|
||||
The Quick Brown Fox
|
||||
No match: POSIX code 17: match failed
|
||||
|
||||
/\w/ucp
|
||||
+++\x{c2}
|
||||
0: \xc2
|
||||
|
||||
# End of testdata/testinput17
|
||||
/the quick brown fox/i
|
||||
the quick brown fox
|
||||
0: the quick brown fox
|
||||
The Quick Brown Fox
|
||||
0: The Quick Brown Fox
|
||||
|
||||
/(*LF)abc.def/
|
||||
\= Expect no match
|
||||
abc\ndef
|
||||
No match: POSIX code 17: match failed
|
||||
|
||||
/(*LF)abc$/
|
||||
abc
|
||||
0: abc
|
||||
abc\n
|
||||
0: abc
|
||||
|
||||
/(abc)\2/
|
||||
Failed: POSIX code 15: bad back reference at offset 6
|
||||
|
||||
/(abc\1)/
|
||||
\= Expect no match
|
||||
abc
|
||||
No match: POSIX code 17: match failed
|
||||
|
||||
/a*(b+)(z)(z)/
|
||||
aaaabbbbzzzz
|
||||
0: aaaabbbbzz
|
||||
1: bbbb
|
||||
2: z
|
||||
3: z
|
||||
aaaabbbbzzzz\=ovector=0
|
||||
Matched without capture
|
||||
aaaabbbbzzzz\=ovector=1
|
||||
0: aaaabbbbzz
|
||||
aaaabbbbzzzz\=ovector=2
|
||||
0: aaaabbbbzz
|
||||
1: bbbb
|
||||
|
||||
/(*ANY)ab.cd/
|
||||
ab-cd
|
||||
0: ab-cd
|
||||
ab=cd
|
||||
0: ab=cd
|
||||
\= Expect no match
|
||||
ab\ncd
|
||||
No match: POSIX code 17: match failed
|
||||
|
||||
/ab.cd/s
|
||||
ab-cd
|
||||
0: ab-cd
|
||||
ab=cd
|
||||
0: ab=cd
|
||||
ab\ncd
|
||||
0: ab\x0acd
|
||||
|
||||
/a(b)c/posix_nosub
|
||||
abc
|
||||
Matched with REG_NOSUB
|
||||
|
||||
/a(?P<name>b)c/posix_nosub
|
||||
abc
|
||||
Matched with REG_NOSUB
|
||||
|
||||
/(a)\1/posix_nosub
|
||||
zaay
|
||||
Matched with REG_NOSUB
|
||||
|
||||
/a?|b?/
|
||||
abc
|
||||
0: a
|
||||
\= Expect no match
|
||||
ddd\=notempty
|
||||
No match: POSIX code 17: match failed
|
||||
|
||||
/\w+A/
|
||||
CDAAAAB
|
||||
0: CDAAAA
|
||||
|
||||
/\w+A/ungreedy
|
||||
CDAAAAB
|
||||
0: CDA
|
||||
|
||||
/\Biss\B/I,aftertext
|
||||
** Ignored with POSIX interface: info
|
||||
Mississippi
|
||||
0: iss
|
||||
0+ issippi
|
||||
|
||||
/abc/\
|
||||
Failed: POSIX code 9: bad escape sequence at offset 4
|
||||
|
||||
"(?(?C)"
|
||||
Failed: POSIX code 11: unbalanced () at offset 6
|
||||
|
||||
"(?(?C))"
|
||||
Failed: POSIX code 3: pattern error at offset 6
|
||||
|
||||
/abcd/substitute_extended
|
||||
** Ignored with POSIX interface: substitute_extended
|
||||
|
||||
/\[A]{1000000}**/expand,regerror_buffsize=31
|
||||
Failed: POSIX code 4: ? * + invalid at offset 100000
|
||||
** regerror() message truncated
|
||||
|
||||
/\[A]{1000000}**/expand,regerror_buffsize=32
|
||||
Failed: POSIX code 4: ? * + invalid at offset 1000001
|
||||
|
||||
//posix_nosub
|
||||
\=offset=70000
|
||||
** Ignored with POSIX interface: offset
|
||||
Matched with REG_NOSUB
|
||||
|
||||
/(?=(a\K))/
|
||||
a
|
||||
Start of matched string is beyond its end - displaying from end to start.
|
||||
0: a
|
||||
1: a
|
||||
|
||||
# End of testdata/testinput18
|
||||
|
117
pcre2/testdata/testoutput19
vendored
@ -1,100 +1,21 @@
|
||||
# This set of tests exercises the serialization/deserialization functions in
|
||||
# the library. It does not use UTF or JIT.
|
||||
|
||||
#forbid_utf
|
||||
|
||||
# Compile several patterns, push them onto the stack, and then write them
|
||||
# all to a file.
|
||||
|
||||
#pattern push
|
||||
|
||||
/(?<NAME>(?&NAME_PAT))\s+(?<ADDR>(?&ADDRESS_PAT))
|
||||
(?(DEFINE)
|
||||
(?<NAME_PAT>[a-z]+)
|
||||
(?<ADDRESS_PAT>\d+)
|
||||
)/x
|
||||
/^(?:((.)(?1)\2|)|((.)(?3)\4|.))$/i
|
||||
|
||||
#save testsaved1
|
||||
|
||||
# Do it again for some more patterns.
|
||||
|
||||
/(*MARK:A)(*SKIP:B)(C|X)/mark
|
||||
** Ignored when compiled pattern is stacked with 'push': mark
|
||||
/(?:(?<n>foo)|(?<n>bar))\k<n>/dupnames
|
||||
|
||||
#save testsaved2
|
||||
#pattern -push
|
||||
|
||||
# Reload the patterns, then pop them one by one and check them.
|
||||
|
||||
#load testsaved1
|
||||
#load testsaved2
|
||||
|
||||
#pop info
|
||||
Capturing subpattern count = 2
|
||||
Max back reference = 2
|
||||
Named capturing subpatterns:
|
||||
n 1
|
||||
n 2
|
||||
Options: dupnames
|
||||
Starting code units: b f
|
||||
Subject length lower bound = 6
|
||||
foofoo
|
||||
0: foofoo
|
||||
1: foo
|
||||
barbar
|
||||
0: barbar
|
||||
1: <unset>
|
||||
2: bar
|
||||
# This set of tests is run only with the 8-bit library. It tests the POSIX
|
||||
# interface with UTF/UCP support, which is supported only with the 8-bit
|
||||
# library. This test should not be run with JIT (which is not available for the
|
||||
# POSIX interface).
|
||||
|
||||
#pop mark
|
||||
C
|
||||
0: C
|
||||
1: C
|
||||
MK: A
|
||||
D
|
||||
No match, mark = A
|
||||
#pattern posix
|
||||
|
||||
/a\x{1234}b/utf
|
||||
a\x{1234}b
|
||||
0: a\x{1234}b
|
||||
|
||||
/\w/
|
||||
\= Expect no match
|
||||
+++\x{c2}
|
||||
No match: POSIX code 17: match failed
|
||||
|
||||
/\w/ucp
|
||||
+++\x{c2}
|
||||
0: \xc2
|
||||
|
||||
#pop
|
||||
AmanaplanacanalPanama
|
||||
0: AmanaplanacanalPanama
|
||||
1: <unset>
|
||||
2: <unset>
|
||||
3: AmanaplanacanalPanama
|
||||
4: A
|
||||
|
||||
#pop info
|
||||
Capturing subpattern count = 4
|
||||
Named capturing subpatterns:
|
||||
ADDR 2
|
||||
ADDRESS_PAT 4
|
||||
NAME 1
|
||||
NAME_PAT 3
|
||||
Options: extended
|
||||
Subject length lower bound = 3
|
||||
metcalfe 33
|
||||
0: metcalfe 33
|
||||
1: metcalfe
|
||||
2: 33
|
||||
|
||||
# Check for an error when different tables are used.
|
||||
|
||||
/abc/push,tables=1
|
||||
/xyz/push,tables=2
|
||||
#save testsaved1
|
||||
Serialization failed: error -30: patterns do not all use the same character tables
|
||||
|
||||
#pop
|
||||
xyz
|
||||
0: xyz
|
||||
|
||||
#pop
|
||||
abc
|
||||
0: abc
|
||||
|
||||
#pop should give an error
|
||||
** Can't pop off an empty stack
|
||||
pqr
|
||||
|
||||
# End of testinput19
|
||||
# End of testdata/testinput19
|
||||
|
3003
pcre2/testdata/testoutput2
vendored
File diff suppressed because it is too large
150
pcre2/testdata/testoutput20
vendored
Normal file
@ -0,0 +1,150 @@
|
||||
# This set of tests exercises the serialization/deserialization and code copy
|
||||
# functions in the library. It does not use UTF or JIT.
|
||||
|
||||
#forbid_utf
|
||||
|
||||
# Compile several patterns, push them onto the stack, and then write them
|
||||
# all to a file.
|
||||
|
||||
#pattern push
|
||||
|
||||
/(?<NAME>(?&NAME_PAT))\s+(?<ADDR>(?&ADDRESS_PAT))
|
||||
(?(DEFINE)
|
||||
(?<NAME_PAT>[a-z]+)
|
||||
(?<ADDRESS_PAT>\d+)
|
||||
)/x
|
||||
/^(?:((.)(?1)\2|)|((.)(?3)\4|.))$/i
|
||||
|
||||
#save testsaved1
|
||||
|
||||
# Do it again for some more patterns.
|
||||
|
||||
/(*MARK:A)(*SKIP:B)(C|X)/mark
|
||||
** Ignored when compiled pattern is stacked with 'push': mark
|
||||
/(?:(?<n>foo)|(?<n>bar))\k<n>/dupnames
|
||||
|
||||
#save testsaved2
|
||||
#pattern -push
|
||||
|
||||
# Reload the patterns, then pop them one by one and check them.
|
||||
|
||||
#load testsaved1
|
||||
#load testsaved2
|
||||
|
||||
#pop info
|
||||
Capturing subpattern count = 2
|
||||
Max back reference = 2
|
||||
Named capturing subpatterns:
|
||||
n 1
|
||||
n 2
|
||||
Options: dupnames
|
||||
Starting code units: b f
|
||||
Subject length lower bound = 6
|
||||
foofoo
|
||||
0: foofoo
|
||||
1: foo
|
||||
barbar
|
||||
0: barbar
|
||||
1: <unset>
|
||||
2: bar
|
||||
|
||||
#pop mark
|
||||
C
|
||||
0: C
|
||||
1: C
|
||||
MK: A
|
||||
\= Expect no match
|
||||
D
|
||||
No match, mark = A
|
||||
|
||||
#pop
|
||||
AmanaplanacanalPanama
|
||||
0: AmanaplanacanalPanama
|
||||
1: <unset>
|
||||
2: <unset>
|
||||
3: AmanaplanacanalPanama
|
||||
4: A
|
||||
|
||||
#pop info
|
||||
Capturing subpattern count = 4
|
||||
Named capturing subpatterns:
|
||||
ADDR 2
|
||||
ADDRESS_PAT 4
|
||||
NAME 1
|
||||
NAME_PAT 3
|
||||
Options: extended
|
||||
Subject length lower bound = 3
|
||||
metcalfe 33
|
||||
0: metcalfe 33
|
||||
1: metcalfe
|
||||
2: 33
|
||||
|
||||
# Check for an error when different tables are used.
|
||||
|
||||
/abc/push,tables=1
|
||||
/xyz/push,tables=2
|
||||
#save testsaved1
|
||||
Serialization failed: error -30: patterns do not all use the same character tables
|
||||
|
||||
#pop
|
||||
xyz
|
||||
0: xyz
|
||||
|
||||
#pop
|
||||
abc
|
||||
0: abc
|
||||
|
||||
#pop should give an error
|
||||
** Can't pop off an empty stack
|
||||
pqr
|
||||
|
||||
/abcd/pushcopy
|
||||
abcd
|
||||
0: abcd
|
||||
|
||||
#pop
|
||||
abcd
|
||||
0: abcd
|
||||
|
||||
#pop should give an error
|
||||
** Can't pop off an empty stack
|
||||
|
||||
/abcd/push
|
||||
#popcopy
|
||||
abcd
|
||||
0: abcd
|
||||
|
||||
#pop
|
||||
abcd
|
||||
0: abcd
|
||||
|
||||
/abcd/push
|
||||
#save testsaved1
|
||||
#pop should give an error
|
||||
** Can't pop off an empty stack
|
||||
|
||||
#load testsaved1
|
||||
#popcopy
|
||||
abcd
|
||||
0: abcd
|
||||
|
||||
#pop
|
||||
abcd
|
||||
0: abcd
|
||||
|
||||
#pop should give an error
|
||||
** Can't pop off an empty stack
|
||||
|
||||
/abcd/pushtablescopy
|
||||
abcd
|
||||
0: abcd
|
||||
|
||||
#popcopy
|
||||
abcd
|
||||
0: abcd
|
||||
|
||||
#pop
|
||||
abcd
|
||||
0: abcd
|
||||
|
||||
# End of testinput20
|
94
pcre2/testdata/testoutput21
vendored
Normal file
@ -0,0 +1,94 @@
|
||||
# These are tests of \C that do not involve UTF. They are not run when \C is
|
||||
# disabled by compiling with --enable-never-backslash-C.
|
||||
|
||||
/\C+\D \C+\d \C+\S \C+\s \C+\W \C+\w \C+. \C+\R \C+\H \C+\h \C+\V \C+\v \C+\Z \C+\z \C+$/Bx
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
AllAny+
|
||||
\D
|
||||
AllAny+
|
||||
\d
|
||||
AllAny+
|
||||
\S
|
||||
AllAny+
|
||||
\s
|
||||
AllAny+
|
||||
\W
|
||||
AllAny+
|
||||
\w
|
||||
AllAny+
|
||||
Any
|
||||
AllAny+
|
||||
\R
|
||||
AllAny+
|
||||
\H
|
||||
AllAny+
|
||||
\h
|
||||
AllAny+
|
||||
\V
|
||||
AllAny+
|
||||
\v
|
||||
AllAny+
|
||||
\Z
|
||||
AllAny++
|
||||
\z
|
||||
AllAny+
|
||||
$
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/\D+\C \d+\C \S+\C \s+\C \W+\C \w+\C .+\C \R+\C \H+\C \h+\C \V+\C \v+\C a+\C \n+\C \C+\C/Bx
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
\D+
|
||||
AllAny
|
||||
\d+
|
||||
AllAny
|
||||
\S+
|
||||
AllAny
|
||||
\s+
|
||||
AllAny
|
||||
\W+
|
||||
AllAny
|
||||
\w+
|
||||
AllAny
|
||||
Any+
|
||||
AllAny
|
||||
\R+
|
||||
AllAny
|
||||
\H+
|
||||
AllAny
|
||||
\h+
|
||||
AllAny
|
||||
\V+
|
||||
AllAny
|
||||
\v+
|
||||
AllAny
|
||||
a+
|
||||
AllAny
|
||||
\x0a+
|
||||
AllAny
|
||||
AllAny+
|
||||
AllAny
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/ab\Cde/never_backslash_c
|
||||
Failed: error 183 at offset 4: using \C is disabled by the application
|
||||
|
||||
/ab\Cde/info
|
||||
Capturing subpattern count = 0
|
||||
Contains \C
|
||||
First code unit = 'a'
|
||||
Last code unit = 'e'
|
||||
Subject length lower bound = 5
|
||||
abXde
|
||||
0: abXde
|
||||
|
||||
/(?<=ab\Cde)X/
|
||||
abZdeX
|
||||
0: X
|
||||
|
||||
# End of testinput21
|
169
pcre2/testdata/testoutput22-16
vendored
Normal file
@ -0,0 +1,169 @@
|
||||
# Tests of \C when Unicode support is available. Note that \C is not supported
|
||||
# for DFA matching in UTF mode, so this test is not run with -dfa. The output
|
||||
# of this test is different in 8-, 16-, and 32-bit modes. Some tests may match
|
||||
# in some widths and not in others.
|
||||
|
||||
/ab\Cde/utf,info
|
||||
Capturing subpattern count = 0
|
||||
Contains \C
|
||||
Options: utf
|
||||
First code unit = 'a'
|
||||
Last code unit = 'e'
|
||||
Subject length lower bound = 0
|
||||
abXde
|
||||
0: abXde
|
||||
|
||||
# This should produce an error diagnostic (\C in UTF lookbehind) in 8-bit and
|
||||
# 16-bit modes, but not in 32-bit mode.
|
||||
|
||||
/(?<=ab\Cde)X/utf
|
||||
Failed: error 136 at offset 0: \C is not allowed in a lookbehind assertion in UTF-16 mode
|
||||
ab!deXYZ
|
||||
|
||||
# Autopossessification tests
|
||||
|
||||
/\C+\X \X+\C/Bx
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
AllAny+
|
||||
extuni
|
||||
extuni+
|
||||
AllAny
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/\C+\X \X+\C/Bx,utf
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
Anybyte+
|
||||
extuni
|
||||
extuni+
|
||||
Anybyte
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/\C\X*TӅ;
|
||||
{0,6}\v+
|
||||
F
|
||||
/utf
|
||||
\= Expect no match
|
||||
Ӆ\x0a
|
||||
No match
|
||||
|
||||
/\C(\W?ſ)'?{{/utf
|
||||
\= Expect no match
|
||||
\\C(\\W?ſ)'?{{
|
||||
No match
|
||||
|
||||
/X(\C{3})/utf
|
||||
X\x{1234}
|
||||
No match
|
||||
X\x{11234}Y
|
||||
0: X\x{11234}Y
|
||||
1: \x{11234}Y
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}Y
|
||||
1: \x{11234}Y
|
||||
|
||||
/X(\C{4})/utf
|
||||
X\x{1234}YZ
|
||||
No match
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}YZ
|
||||
1: \x{11234}YZ
|
||||
X\x{11234}YZW
|
||||
0: X\x{11234}YZ
|
||||
1: \x{11234}YZ
|
||||
|
||||
/X\C*/utf
|
||||
XYZabcdce
|
||||
0: XYZabcdce
|
||||
|
||||
/X\C*?/utf
|
||||
XYZabcde
|
||||
0: X
|
||||
|
||||
/X\C{3,5}/utf
|
||||
Xabcdefg
|
||||
0: Xabcde
|
||||
X\x{1234}
|
||||
No match
|
||||
X\x{1234}YZ
|
||||
0: X\x{1234}YZ
|
||||
X\x{1234}\x{512}
|
||||
No match
|
||||
X\x{1234}\x{512}YZ
|
||||
0: X\x{1234}\x{512}YZ
|
||||
X\x{11234}Y
|
||||
0: X\x{11234}Y
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}YZ
|
||||
X\x{11234}\x{512}
|
||||
0: X\x{11234}\x{512}
|
||||
X\x{11234}\x{512}YZ
|
||||
0: X\x{11234}\x{512}YZ
|
||||
X\x{11234}\x{512}\x{11234}Z
|
||||
0: X\x{11234}\x{512}\x{11234}
|
||||
|
||||
/X\C{3,5}?/utf
|
||||
Xabcdefg
|
||||
0: Xabc
|
||||
X\x{1234}
|
||||
No match
|
||||
X\x{1234}YZ
|
||||
0: X\x{1234}YZ
|
||||
X\x{1234}\x{512}
|
||||
No match
|
||||
X\x{11234}Y
|
||||
0: X\x{11234}Y
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}Y
|
||||
X\x{11234}\x{512}YZ
|
||||
0: X\x{11234}\x{512}
|
||||
X\x{11234}
|
||||
No match
|
||||
|
||||
/a\Cb/utf
|
||||
aXb
|
||||
0: aXb
|
||||
a\nb
|
||||
0: a\x{0a}b
|
||||
a\x{100}b
|
||||
0: a\x{100}b
|
||||
|
||||
/a\C\Cb/utf
|
||||
a\x{100}b
|
||||
No match
|
||||
a\x{12257}b
|
||||
0: a\x{12257}b
|
||||
a\x{12257}\x{11234}b
|
||||
No match
|
||||
|
||||
/ab\Cde/utf
|
||||
abXde
|
||||
0: abXde
|
||||
|
||||
# This one is here not because it's different to Perl, but because the way
|
||||
# the captured single code unit is displayed. (In Perl it becomes a character,
|
||||
# and you can't tell the difference.)
|
||||
|
||||
/X(\C)(.*)/utf
|
||||
X\x{1234}
|
||||
0: X\x{1234}
|
||||
1: \x{1234}
|
||||
2:
|
||||
X\nabc
|
||||
0: X\x{0a}abc
|
||||
1: \x{0a}
|
||||
2: abc
|
||||
|
||||
# This one is here because Perl gives out a grumbly error message (quite
|
||||
# correctly, but that messes up comparisons).
|
||||
|
||||
/a\Cb/utf
|
||||
\= Expect no match in 8-bit mode
|
||||
a\x{100}b
|
||||
0: a\x{100}b
|
||||
|
167
pcre2/testdata/testoutput22-32
vendored
Normal file
@ -0,0 +1,167 @@
|
||||
# Tests of \C when Unicode support is available. Note that \C is not supported
|
||||
# for DFA matching in UTF mode, so this test is not run with -dfa. The output
|
||||
# of this test is different in 8-, 16-, and 32-bit modes. Some tests may match
|
||||
# in some widths and not in others.
|
||||
|
||||
/ab\Cde/utf,info
|
||||
Capturing subpattern count = 0
|
||||
Contains \C
|
||||
Options: utf
|
||||
First code unit = 'a'
|
||||
Last code unit = 'e'
|
||||
Subject length lower bound = 5
|
||||
abXde
|
||||
0: abXde
|
||||
|
||||
# This should produce an error diagnostic (\C in UTF lookbehind) in 8-bit and
|
||||
# 16-bit modes, but not in 32-bit mode.
|
||||
|
||||
/(?<=ab\Cde)X/utf
|
||||
ab!deXYZ
|
||||
0: X
|
||||
|
||||
# Autopossessification tests
|
||||
|
||||
/\C+\X \X+\C/Bx
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
AllAny+
|
||||
extuni
|
||||
extuni+
|
||||
AllAny
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/\C+\X \X+\C/Bx,utf
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
AllAny+
|
||||
extuni
|
||||
extuni+
|
||||
AllAny
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/\C\X*TӅ;
|
||||
{0,6}\v+
|
||||
F
|
||||
/utf
|
||||
\= Expect no match
|
||||
Ӆ\x0a
|
||||
No match
|
||||
|
||||
/\C(\W?ſ)'?{{/utf
|
||||
\= Expect no match
|
||||
\\C(\\W?ſ)'?{{
|
||||
No match
|
||||
|
||||
/X(\C{3})/utf
|
||||
X\x{1234}
|
||||
No match
|
||||
X\x{11234}Y
|
||||
No match
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}YZ
|
||||
1: \x{11234}YZ
|
||||
|
||||
/X(\C{4})/utf
|
||||
X\x{1234}YZ
|
||||
No match
|
||||
X\x{11234}YZ
|
||||
No match
|
||||
X\x{11234}YZW
|
||||
0: X\x{11234}YZW
|
||||
1: \x{11234}YZW
|
||||
|
||||
/X\C*/utf
|
||||
XYZabcdce
|
||||
0: XYZabcdce
|
||||
|
||||
/X\C*?/utf
|
||||
XYZabcde
|
||||
0: X
|
||||
|
||||
/X\C{3,5}/utf
|
||||
Xabcdefg
|
||||
0: Xabcde
|
||||
X\x{1234}
|
||||
No match
|
||||
X\x{1234}YZ
|
||||
0: X\x{1234}YZ
|
||||
X\x{1234}\x{512}
|
||||
No match
|
||||
X\x{1234}\x{512}YZ
|
||||
0: X\x{1234}\x{512}YZ
|
||||
X\x{11234}Y
|
||||
No match
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}YZ
|
||||
X\x{11234}\x{512}
|
||||
No match
|
||||
X\x{11234}\x{512}YZ
|
||||
0: X\x{11234}\x{512}YZ
|
||||
X\x{11234}\x{512}\x{11234}Z
|
||||
0: X\x{11234}\x{512}\x{11234}Z
|
||||
|
||||
/X\C{3,5}?/utf
|
||||
Xabcdefg
|
||||
0: Xabc
|
||||
X\x{1234}
|
||||
No match
|
||||
X\x{1234}YZ
|
||||
0: X\x{1234}YZ
|
||||
X\x{1234}\x{512}
|
||||
No match
|
||||
X\x{11234}Y
|
||||
No match
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}YZ
|
||||
X\x{11234}\x{512}YZ
|
||||
0: X\x{11234}\x{512}Y
|
||||
X\x{11234}
|
||||
No match
|
||||
|
||||
/a\Cb/utf
|
||||
aXb
|
||||
0: aXb
|
||||
a\nb
|
||||
0: a\x{0a}b
|
||||
a\x{100}b
|
||||
0: a\x{100}b
|
||||
|
||||
/a\C\Cb/utf
|
||||
a\x{100}b
|
||||
No match
|
||||
a\x{12257}b
|
||||
No match
|
||||
a\x{12257}\x{11234}b
|
||||
0: a\x{12257}\x{11234}b
|
||||
|
||||
/ab\Cde/utf
|
||||
abXde
|
||||
0: abXde
|
||||
|
||||
# This one is here not because it's different to Perl, but because the way
|
||||
# the captured single code unit is displayed. (In Perl it becomes a character,
|
||||
# and you can't tell the difference.)
|
||||
|
||||
/X(\C)(.*)/utf
|
||||
X\x{1234}
|
||||
0: X\x{1234}
|
||||
1: \x{1234}
|
||||
2:
|
||||
X\nabc
|
||||
0: X\x{0a}abc
|
||||
1: \x{0a}
|
||||
2: abc
|
||||
|
||||
# This one is here because Perl gives out a grumbly error message (quite
|
||||
# correctly, but that messes up comparisons).
|
||||
|
||||
/a\Cb/utf
|
||||
\= Expect no match in 8-bit mode
|
||||
a\x{100}b
|
||||
0: a\x{100}b
|
||||
|
171
pcre2/testdata/testoutput22-8
vendored
Normal file
@ -0,0 +1,171 @@
|
||||
# Tests of \C when Unicode support is available. Note that \C is not supported
|
||||
# for DFA matching in UTF mode, so this test is not run with -dfa. The output
|
||||
# of this test is different in 8-, 16-, and 32-bit modes. Some tests may match
|
||||
# in some widths and not in others.
|
||||
|
||||
/ab\Cde/utf,info
|
||||
Capturing subpattern count = 0
|
||||
Contains \C
|
||||
Options: utf
|
||||
First code unit = 'a'
|
||||
Last code unit = 'e'
|
||||
Subject length lower bound = 0
|
||||
abXde
|
||||
0: abXde
|
||||
|
||||
# This should produce an error diagnostic (\C in UTF lookbehind) in 8-bit and
|
||||
# 16-bit modes, but not in 32-bit mode.
|
||||
|
||||
/(?<=ab\Cde)X/utf
|
||||
Failed: error 136 at offset 0: \C is not allowed in a lookbehind assertion in UTF-8 mode
|
||||
ab!deXYZ
|
||||
|
||||
# Autopossessification tests
|
||||
|
||||
/\C+\X \X+\C/Bx
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
AllAny+
|
||||
extuni
|
||||
extuni+
|
||||
AllAny
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/\C+\X \X+\C/Bx,utf
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
Anybyte+
|
||||
extuni
|
||||
extuni+
|
||||
Anybyte
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/\C\X*TӅ;
|
||||
{0,6}\v+
|
||||
F
|
||||
/utf
|
||||
\= Expect no match
|
||||
Ӆ\x0a
|
||||
No match
|
||||
|
||||
/\C(\W?ſ)'?{{/utf
|
||||
\= Expect no match
|
||||
\\C(\\W?ſ)'?{{
|
||||
No match
|
||||
|
||||
/X(\C{3})/utf
|
||||
X\x{1234}
|
||||
0: X\x{1234}
|
||||
1: \x{1234}
|
||||
X\x{11234}Y
|
||||
0: X\x{f0}\x{91}\x{88}
|
||||
1: \x{f0}\x{91}\x{88}
|
||||
X\x{11234}YZ
|
||||
0: X\x{f0}\x{91}\x{88}
|
||||
1: \x{f0}\x{91}\x{88}
|
||||
|
||||
/X(\C{4})/utf
|
||||
X\x{1234}YZ
|
||||
0: X\x{1234}Y
|
||||
1: \x{1234}Y
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}
|
||||
1: \x{11234}
|
||||
X\x{11234}YZW
|
||||
0: X\x{11234}
|
||||
1: \x{11234}
|
||||
|
||||
/X\C*/utf
|
||||
XYZabcdce
|
||||
0: XYZabcdce
|
||||
|
||||
/X\C*?/utf
|
||||
XYZabcde
|
||||
0: X
|
||||
|
||||
/X\C{3,5}/utf
|
||||
Xabcdefg
|
||||
0: Xabcde
|
||||
X\x{1234}
|
||||
0: X\x{1234}
|
||||
X\x{1234}YZ
|
||||
0: X\x{1234}YZ
|
||||
X\x{1234}\x{512}
|
||||
0: X\x{1234}\x{512}
|
||||
X\x{1234}\x{512}YZ
|
||||
0: X\x{1234}\x{512}
|
||||
X\x{11234}Y
|
||||
0: X\x{11234}Y
|
||||
X\x{11234}YZ
|
||||
0: X\x{11234}Y
|
||||
X\x{11234}\x{512}
|
||||
0: X\x{11234}\x{d4}
|
||||
X\x{11234}\x{512}YZ
|
||||
0: X\x{11234}\x{d4}
|
||||
X\x{11234}\x{512}\x{11234}Z
|
||||
0: X\x{11234}\x{d4}
|
||||
|
||||
/X\C{3,5}?/utf
|
||||
Xabcdefg
|
||||
0: Xabc
|
||||
X\x{1234}
|
||||
0: X\x{1234}
|
||||
X\x{1234}YZ
|
||||
0: X\x{1234}
|
||||
X\x{1234}\x{512}
|
||||
0: X\x{1234}
|
||||
X\x{11234}Y
|
||||
0: X\x{f0}\x{91}\x{88}
|
||||
X\x{11234}YZ
|
||||
0: X\x{f0}\x{91}\x{88}
|
||||
X\x{11234}\x{512}YZ
|
||||
0: X\x{f0}\x{91}\x{88}
|
||||
X\x{11234}
|
||||
0: X\x{f0}\x{91}\x{88}
|
||||
|
||||
/a\Cb/utf
|
||||
aXb
|
||||
0: aXb
|
||||
a\nb
|
||||
0: a\x{0a}b
|
||||
a\x{100}b
|
||||
No match
|
||||
|
||||
/a\C\Cb/utf
|
||||
a\x{100}b
|
||||
0: a\x{100}b
|
||||
a\x{12257}b
|
||||
No match
|
||||
a\x{12257}\x{11234}b
|
||||
No match
|
||||
|
||||
/ab\Cde/utf
|
||||
abXde
|
||||
0: abXde
|
||||
|
||||
# This one is here not because it's different to Perl, but because the way
|
||||
# the captured single code unit is displayed. (In Perl it becomes a character,
|
||||
# and you can't tell the difference.)
|
||||
|
||||
/X(\C)(.*)/utf
|
||||
X\x{1234}
|
||||
0: X\x{1234}
|
||||
1: \x{e1}
|
||||
2: \x{88}\x{b4}
|
||||
X\nabc
|
||||
0: X\x{0a}abc
|
||||
1: \x{0a}
|
||||
2: abc
|
||||
|
||||
# This one is here because Perl gives out a grumbly error message (quite
|
||||
# correctly, but that messes up comparisons).
|
||||
|
||||
/a\Cb/utf
|
||||
\= Expect no match in 8-bit mode
|
||||
a\x{100}b
|
||||
No match
|
||||
|
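Note on the three testoutput22 variants above: they differ because \C matches a single code unit, and the same subject occupies a different number of code units in each library width. As a rough illustration only (this is not part of PCRE2 or of this commit, and `code_units` is a hypothetical helper name), the counts can be checked in Python:

```python
def code_units(text: str) -> dict:
    """Count the code units `text` occupies in UTF-8, UTF-16 and UTF-32."""
    return {
        8: len(text.encode("utf-8")),            # one unit per byte
        16: len(text.encode("utf-16-le")) // 2,  # two bytes per unit, no BOM
        32: len(text.encode("utf-32-le")) // 4,  # four bytes per unit, no BOM
    }


if __name__ == "__main__":
    # Subjects taken from the tests above (the part after the leading "X").
    for subject in ("\u1234", "\U00011234", "\U00011234Y"):
        label = "".join(f"\\x{{{ord(ch):x}}}" for ch in subject)
        print(label, code_units(subject))
```

Under these counts, `\x{11234}Y` is 5 code units in 8-bit mode, 3 in 16-bit mode and 2 in 32-bit mode. This is consistent with `/X(\C{3})/utf` matching the whole of `X\x{11234}Y` only in the 16-bit output, matching just the first three bytes in the 8-bit output, and giving "No match" in the 32-bit output.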
8
pcre2/testdata/testoutput23
vendored
Normal file
@ -0,0 +1,8 @@
|
||||
# This test is run when PCRE2 has been built with --enable-never-backslash-C,
|
||||
# which disables the use of \C. All we can do is check that it gives the
|
||||
# correct error message.
|
||||
|
||||
/a\Cb/
|
||||
Failed: error 185 at offset 3: using \C is disabled in this PCRE2 library
|
||||
|
||||
# End of testinput23
|
29
pcre2/testdata/testoutput3
vendored
@ -8,8 +8,7 @@
|
||||
#forbid_utf
|
||||
|
||||
/^[\w]+/
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
@ -18,8 +17,7 @@ No match
|
||||
0: École
|
||||
|
||||
/^[\w]+/
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
@ -28,30 +26,26 @@ No match
|
||||
0: \xc9
|
||||
|
||||
/^[\W]+/locale=fr_FR
|
||||
*** Failers
|
||||
0: ***
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
/[\b]/
|
||||
\b
|
||||
0: \x08
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
a
|
||||
No match
|
||||
|
||||
/[\b]/locale=fr_FR
|
||||
\b
|
||||
0: \x08
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
a
|
||||
No match
|
||||
|
||||
/^\w+/
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
@ -66,18 +60,14 @@ No match
|
||||
2: cole
|
||||
|
||||
/(.+)\b(.+)/locale=fr_FR
|
||||
*** Failers
|
||||
0: *** Failers
|
||||
1: ***
|
||||
2: Failers
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
/École/i
|
||||
École
|
||||
0: \xc9cole
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
école
|
||||
No match
|
||||
|
||||
@ -114,8 +104,7 @@ Subject length lower bound = 1
|
||||
/^[\xc8-\xc9]/
|
||||
École
|
||||
0: É
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
école
|
||||
No match
|
||||
|
||||
|
29
pcre2/testdata/testoutput3A
vendored
@ -8,8 +8,7 @@
|
||||
#forbid_utf
|
||||
|
||||
/^[\w]+/
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
@ -18,8 +17,7 @@ No match
|
||||
0: École
|
||||
|
||||
/^[\w]+/
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
@ -28,30 +26,26 @@ No match
|
||||
0: \xc9
|
||||
|
||||
/^[\W]+/locale=fr_FR
|
||||
*** Failers
|
||||
0: ***
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
/[\b]/
|
||||
\b
|
||||
0: \x08
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
a
|
||||
No match
|
||||
|
||||
/[\b]/locale=fr_FR
|
||||
\b
|
||||
0: \x08
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
a
|
||||
No match
|
||||
|
||||
/^\w+/
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
@ -66,18 +60,14 @@ No match
|
||||
2: cole
|
||||
|
||||
/(.+)\b(.+)/locale=fr_FR
|
||||
*** Failers
|
||||
0: *** Failers
|
||||
1: ***
|
||||
2: Failers
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
/École/i
|
||||
École
|
||||
0: \xc9cole
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
école
|
||||
No match
|
||||
|
||||
@ -114,8 +104,7 @@ Subject length lower bound = 1
|
||||
/^[\xc8-\xc9]/
|
||||
École
|
||||
0: É
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
école
|
||||
No match
|
||||
|
||||
|
29
pcre2/testdata/testoutput3B
vendored
@ -8,8 +8,7 @@
|
||||
#forbid_utf
|
||||
|
||||
/^[\w]+/
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
@ -18,8 +17,7 @@ No match
|
||||
0: École
|
||||
|
||||
/^[\w]+/
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
@ -28,30 +26,26 @@ No match
|
||||
0: \xc9
|
||||
|
||||
/^[\W]+/locale=fr_FR
|
||||
*** Failers
|
||||
0: ***
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
/[\b]/
|
||||
\b
|
||||
0: \x08
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
a
|
||||
No match
|
||||
|
||||
/[\b]/locale=fr_FR
|
||||
\b
|
||||
0: \x08
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
a
|
||||
No match
|
||||
|
||||
/^\w+/
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
@ -66,18 +60,14 @@ No match
|
||||
2: cole
|
||||
|
||||
/(.+)\b(.+)/locale=fr_FR
|
||||
*** Failers
|
||||
0: *** Failers
|
||||
1: ***
|
||||
2: Failers
|
||||
\= Expect no match
|
||||
École
|
||||
No match
|
||||
|
||||
/École/i
|
||||
École
|
||||
0: \xc9cole
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
école
|
||||
No match
|
||||
|
||||
@ -114,8 +104,7 @@ Subject length lower bound = 1
|
||||
/^[\xc8-\xc9]/
|
||||
École
|
||||
0: É
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
école
|
||||
No match
|
||||
|
||||
|
732
pcre2/testdata/testoutput4
vendored
File diff suppressed because it is too large
434
pcre2/testdata/testoutput5
vendored
@ -3,6 +3,8 @@
|
||||
# results in 8-bit, 16-bit, and 32-bit modes are excluded (see tests 10 and
|
||||
# 12).
|
||||
|
||||
#newline_default lf any anycrlf
|
||||
|
||||
# PCRE2 and Perl disagree about the characteristics of certain Unicode
|
||||
# characters. For example, 061C is considered by Perl to be Arabic, though
|
||||
# is it not listed as such in the Unicode Scripts.txt file, and 2066-2069 are
|
||||
@ -11,14 +13,12 @@
|
||||
# test 4.
|
||||
|
||||
/^[\p{Arabic}]/utf
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
\x{061c}
|
||||
No match
|
||||
|
||||
/^[[:graph:]]+$/utf,ucp
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
\x{61c}
|
||||
No match
|
||||
\x{2066}
|
||||
@ -31,8 +31,7 @@ No match
|
||||
No match
|
||||
|
||||
/^[[:print:]]+$/utf,ucp
|
||||
** Failers
|
||||
0: ** Failers
|
||||
\= Expect no match
|
||||
\x{61c}
|
||||
No match
|
||||
\x{2066}
|
||||
@ -76,6 +75,7 @@ No match
|
||||
0: A\x{85}\x{2005}Z
|
||||
|
||||
/^[[:graph:]]+$/utf,ucp
|
||||
\= Expect no match
|
||||
\x{180e}
|
||||
No match
|
||||
|
||||
@ -88,6 +88,7 @@ No match
|
||||
0: \x{09}\x{0a}\x{1d} \x{85}\x{a0}\x{61c}\x{1680}\x{180e}
|
||||
|
||||
/^[[:^print:]]+$/utf,ucp
|
||||
\= Expect no match
|
||||
\x{180e}
|
||||
No match
|
||||
|
||||
@ -182,10 +183,6 @@ Subject length lower bound = 3
|
||||
\x{212ab}\x{212ab}\x{212ab}\x{861}
|
||||
0: \x{212ab}\x{212ab}\x{212ab}
|
||||
|
||||
/(?<=\C)X/utf
|
||||
Failed: error 136 at offset 6: \C is not allowed in a lookbehind assertion
|
||||
Should produce an error diagnostic
|
||||
|
||||
/^[ab]/IB,utf
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
@ -200,8 +197,7 @@ Overall options: anchored utf
|
||||
Subject length lower bound = 1
|
||||
bar
|
||||
0: b
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
c
|
||||
No match
|
||||
\x{ff}
|
||||
@ -227,8 +223,7 @@ Subject length lower bound = 1
|
||||
0: \x{ff}
|
||||
\x{100}
|
||||
0: \x{100}
|
||||
*** Failers
|
||||
0: *
|
||||
\= Expect no match
|
||||
aaa
|
||||
No match
|
||||
|
||||
@ -251,8 +246,7 @@ No match
|
||||
\x{100}\x{100}"12"
|
||||
0: \x{100}\x{100}"12"
|
||||
1: "12"
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
\x{100}\x{100}abcd
|
||||
No match
|
||||
|
||||
@ -303,8 +297,7 @@ Failed: error 108 at offset 15: range out of order in character class
|
||||
0: \x{100}
|
||||
\x{104}
|
||||
0: \x{104}
|
||||
*** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
\x{105}
|
||||
No match
|
||||
\x{ff}
|
||||
@ -581,8 +574,7 @@ Matched, but too many substrings
|
||||
0: a\x{2028}b
|
||||
a\x{2029}b
|
||||
0: a\x{2029}b
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
a\n\rb
|
||||
No match
|
||||
|
||||
@ -623,8 +615,7 @@ No match
|
||||
0: a\x{0a}\x{0d}b
|
||||
a\n\r\x{85}\x0cb
|
||||
0: a\x{0a}\x{0d}\x{85}\x{0c}b
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
ab
|
||||
No match
|
||||
|
||||
@ -643,8 +634,7 @@ No match
|
||||
0: a\x{0a}\x{0d}\x{0a}\x{0d}b
|
||||
a\n\n\r\nb
|
||||
0: a\x{0a}\x{0a}\x{0d}\x{0a}b
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
a\n\n\n\rb
|
||||
No match
|
||||
a\r
|
||||
@ -655,8 +645,7 @@ No match
|
||||
0: X X\x{0a}
|
||||
X\x09X\x0b
|
||||
0: X\x{09}X\x{0b}
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
\x{a0} X\x0a
|
||||
No match
|
||||
|
||||
@ -667,8 +656,7 @@ No match
|
||||
0: \x{09} \x{a0}\x{0a}\x{0b}\x{0c}\x{0d}
|
||||
\x09\x20\x{a0}\x0a\x0b\x0c
|
||||
0: \x{09} \x{a0}\x{0a}\x{0b}\x{0c}
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
\x09\x20\x{a0}\x0a\x0b
|
||||
No match
|
||||
|
||||
@ -677,8 +665,7 @@ No match
|
||||
0: \x{3001}\x{3000}\x{2030}\x{2028}
|
||||
X\x{180e}X\x{85}
|
||||
0: X\x{180e}X\x{85}
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
\x{2009} X\x0a
|
||||
No match
|
||||
|
||||
@ -689,8 +676,7 @@ No match
|
||||
0: \x{09}\x{205f}\x{a0}\x{0a}\x{2029}\x{0c}\x{2028}
|
||||
\x09\x20\x{202f}\x0a\x0b\x0c
|
||||
0: \x{09} \x{202f}\x{0a}\x{0b}\x{0c}
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
\x09\x{200a}\x{a0}\x{2028}\x0b
|
||||
No match
|
||||
|
||||
@ -755,8 +741,7 @@ Subject length lower bound = 3
|
||||
0: a\x{0a}b
|
||||
a\r\nb
|
||||
0: a\x{0d}\x{0a}b
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
a\x{85}b
|
||||
No match
|
||||
a\x0bb
|
||||
@ -793,8 +778,7 @@ Subject length lower bound = 2
|
||||
0: a\x{0a}b
|
||||
a\r\nb
|
||||
0: a\x{0d}\x{0a}b
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
a\x{85}b
|
||||
No match
|
||||
a\x0bb
|
||||
@ -817,14 +801,11 @@ Subject length lower bound = 2
|
||||
0: a\x{85}b
|
||||
a\x0bb
|
||||
0: a\x{0b}b
|
||||
** Failers
|
||||
No match
|
||||
|
||||
/.*a.*=.b.*/utf,newline=any
|
||||
QQQ\x{2029}ABCaXYZ=!bPQR
|
||||
0: ABCaXYZ=!bPQR
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
a\x{2029}b
|
||||
No match
|
||||
\x61\xe2\x80\xa9\x62
|
||||
@ -838,8 +819,7 @@ Failed: error 130 at offset 3: unknown POSIX class name
|
||||
0: a\x{1234}b
|
||||
a\nb
|
||||
0: a\x{0a}b
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
ab
|
||||
No match
|
||||
|
||||
@ -848,8 +828,7 @@ No match
|
||||
0: aXb
|
||||
a\nX\nX\x{1234}b
|
||||
0: a\x{0a}X\x{0a}X\x{1234}b
|
||||
** Failers
|
||||
No match
|
||||
\= Expect no match
|
||||
ab
|
||||
No match
|
||||
|
||||
@ -935,6 +914,7 @@ Partial match: X\x{123}\x{123}\x{123}
|
||||
Partial match: X\x{123}\x{123}\x{123}\x{123}
|
||||
|
||||
/X\x{123}{2,4}b/utf
|
||||
\= Expect no match
|
||||
Xx\=ps
|
||||
No match
|
||||
X\x{123}x\=ps
|
||||
@ -947,6 +927,7 @@ No match
|
||||
No match
|
||||
|
||||
/X\x{123}{2,4}?b/utf
|
||||
\= Expect no match
|
||||
Xx\=ps
|
||||
No match
|
||||
X\x{123}x\=ps
|
||||
@ -959,6 +940,7 @@ No match
|
||||
No match
|
||||
|
||||
/X\x{123}{2,4}+b/utf
|
||||
\= Expect no match
|
||||
Xx\=ps
|
||||
No match
|
||||
X\x{123}x\=ps
|
||||
@ -1745,6 +1727,7 @@ Last code unit = 'y'
|
||||
First code unit = 'x'
|
||||
Last code unit = 'y'
|
||||
Subject length lower bound = 2
|
||||
|
||||
/(?<!^)ETA/utf
|
||||
\= Expect no match
|
||||
ETA
|
||||
@ -1765,7 +1748,7 @@ No match
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
|
||||
/\ud800/utf,alt_bsux,allow_empty_class,match_unset_backref
|
||||
Failed: error 173 at offset 6: disallowed Unicode code point (>= 0xd800 && <= 0xdfff)
|
||||
|
||||
@ -1874,8 +1857,7 @@ Subject length lower bound = 1
|
||||
0: 1234
|
||||
12-34
|
||||
0: 12-34
|
||||
12+\x{661}-34
|
||||
0: 12+\x{661}-34
|
||||
12+\x{661}-34
|
||||
0: 12+\x{661}-34
|
||||
\= Expect no match
|
||||
abcd
|
||||
@ -1995,8 +1977,7 @@ No match
|
||||
0: \x{2069}
|
||||
|
||||
/^\p{Cs}/utf
|
||||
\x{dfff}\=no_utf_check
|
||||
0: \x{dfff}
|
||||
\x{dfff}\=no_utf_check
|
||||
0: \x{dfff}
|
||||
\= Expect no match
|
||||
\x{09f}
|
||||
@ -2021,8 +2002,7 @@ No match
|
||||
/^\p{Sc}+/utf
|
||||
$\x{a2}\x{a3}\x{a4}\x{a5}\x{a6}
|
||||
0: $\x{a2}\x{a3}\x{a4}\x{a5}
|
||||
\x{9f2}
|
||||
0: \x{9f2}
|
||||
\x{9f2}
|
||||
0: \x{9f2}
|
||||
\= Expect no match
|
||||
X
|
||||
@ -2039,8 +2019,7 @@ No match
|
||||
0: \x{1680}
|
||||
\x{2000}
|
||||
0: \x{2000}
|
||||
\x{2001}
|
||||
0: \x{2001}
|
||||
\x{2001}
|
||||
0: \x{2001}
|
||||
\= Expect no match
|
||||
\x{2028}
|
||||
@ -2052,16 +2031,14 @@ No match
|
||||
# properties and has changed how it behaves for caseless matching.
|
||||
|
||||
/\p{^Lu}/i,utf
|
||||
1234
|
||||
0: 1
|
||||
1234
|
||||
0: 1
|
||||
\= Expect no match
|
||||
ABC
|
||||
No match
|
||||
|
||||
/\P{Lu}/i,utf
|
||||
1234
|
||||
0: 1
|
||||
1234
|
||||
0: 1
|
||||
\= Expect no match
|
||||
ABC
|
||||
@ -2070,8 +2047,7 @@ No match
|
||||
/\p{Ll}/i,utf
|
||||
a
|
||||
0: a
|
||||
Az
|
||||
0: z
|
||||
Az
|
||||
0: z
|
||||
\= Expect no match
|
||||
ABC
|
||||
@ -2080,8 +2056,7 @@ No match
|
||||
/\p{Lu}/i,utf
|
||||
A
|
||||
0: A
|
||||
a\x{10a0}B
|
||||
0: \x{10a0}
|
||||
a\x{10a0}B
|
||||
0: \x{10a0}
|
||||
\= Expect no match
|
||||
a
|
||||
@ -2092,8 +2067,7 @@ No match
|
||||
/\p{Lu}/i,utf
|
||||
A
|
||||
0: A
|
||||
aZ
|
||||
0: Z
|
||||
aZ
|
||||
0: Z
|
||||
\= Expect no match
|
||||
abc
|
||||
@ -2182,16 +2156,14 @@ No match
|
||||
0: \x{6ca}
|
||||
\x{a6c}
|
||||
0: \x{a6c}
|
||||
\x{10a7}
|
||||
0: \x{10a7}
|
||||
\x{10a7}
|
||||
0: \x{10a7}
|
||||
\= Expect no match
|
||||
_ABC
|
||||
No match
|
||||
|
||||
/^\p{Xan}+/utf
|
||||
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
|
||||
0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
|
||||
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
|
||||
0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
|
||||
\= Expect no match
|
||||
_ABC
|
||||
@ -2222,16 +2194,14 @@ No match
|
||||
0: \x{6ca}
|
||||
\x{a6c}
|
||||
0: \x{a6c}
|
||||
\x{10a7}
|
||||
0: \x{10a7}
|
||||
\x{10a7}
|
||||
0: \x{10a7}
|
||||
\= Expect no match
|
||||
_ABC
|
||||
No match
|
||||
|
||||
/^[\p{Xan}]+/utf
|
||||
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
|
||||
0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
|
||||
ABCD1234\x{6ca}\x{a6c}\x{10a7}_
|
||||
0: ABCD1234\x{6ca}\x{a6c}\x{10a7}
|
||||
\= Expect no match
|
||||
_ABC
|
||||
@ -2240,8 +2210,7 @@ No match
|
||||
/^>\p{Xsp}/utf
|
||||
>\x{1680}\x{2028}\x{0b}
|
||||
0: >\x{1680}
|
||||
>\x{a0}
|
||||
0: >\x{a0}
|
||||
>\x{a0}
|
||||
0: >\x{a0}
|
||||
\= Expect no match
|
||||
\x{0b}
|
||||
@ -2278,8 +2247,7 @@ No match
|
||||
/^>\p{Xps}/utf
|
||||
>\x{1680}\x{2028}\x{0b}
|
||||
0: >\x{1680}
|
||||
>\x{a0}
|
||||
0: >\x{a0}
|
||||
>\x{a0}
|
||||
0: >\x{a0}
|
||||
\= Expect no match
|
||||
\x{0b}
|
||||
@ -2324,8 +2292,7 @@ No match
|
||||
0: \x{a6c}
|
||||
\x{10a7}
|
||||
0: \x{10a7}
|
||||
_ABC
|
||||
0: _
|
||||
_ABC
|
||||
0: _
|
||||
\= Expect no match
|
||||
[]
|
||||
@ -2362,8 +2329,7 @@ No match
|
||||
0: \x{a6c}
|
||||
\x{10a7}
|
||||
0: \x{10a7}
|
||||
_ABC
|
||||
0: _
|
||||
_ABC
|
||||
0: _
|
||||
\= Expect no match
|
||||
[]
|
||||
@ -2630,8 +2596,7 @@ No match
|
||||
# Without PCRE_UCP, non-ASCII always fail, even if < 256
|
||||
|
||||
/\b...\B/utf
|
||||
abc_
|
||||
0: abc
|
||||
abc_
|
||||
0: abc
|
||||
\= Expect no match
|
||||
\x{37e}abc\x{376}
|
||||
@ -2825,10 +2790,12 @@ No match
|
||||
------------------------------------------------------------------
|
||||
|
||||
# These behaved oddly in Perl, so they are kept in this test
|
||||
|
||||
/(\x{23a}\x{23a}\x{23a})?\1/i,utf
|
||||
\= Expect no match
|
||||
\x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}
|
||||
No match
|
||||
|
||||
/(ȺȺȺ)?\1/i,utf
|
||||
\= Expect no match
|
||||
ȺȺȺⱥⱥ
|
||||
@ -2843,10 +2810,12 @@ No match
|
||||
ȺȺȺⱥⱥⱥ
|
||||
0: \x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}\x{2c65}
|
||||
1: \x{23a}\x{23a}\x{23a}
|
||||
|
||||
/(\x{23a}\x{23a}\x{23a})\1/i,utf
|
||||
\= Expect no match
|
||||
\x{23a}\x{23a}\x{23a}\x{2c65}\x{2c65}
|
||||
No match
|
||||
|
||||
/(ȺȺȺ)\1/i,utf
|
||||
\= Expect no match
|
||||
ȺȺȺⱥⱥ
|
||||
@ -2887,8 +2856,7 @@ No match
|
||||
/^[\p{Batak}]/utf
|
||||
\x{1bc0}
|
||||
0: \x{1bc0}
|
||||
\x{1bff}
|
||||
0: \x{1bff}
|
||||
\x{1bff}
|
||||
0: \x{1bff}
|
||||
\= Expect no match
|
||||
\x{1bf4}
|
||||
@ -2897,8 +2865,7 @@ No match
|
||||
/^[\p{Brahmi}]/utf
|
||||
\x{11000}
|
||||
0: \x{11000}
|
||||
\x{1106f}
|
||||
0: \x{1106f}
|
||||
\x{1106f}
|
||||
0: \x{1106f}
|
||||
\= Expect no match
|
||||
\x{1104e}
|
||||
@ -2907,8 +2874,7 @@ No match
|
||||
/^[\p{Mandaic}]/utf
|
||||
\x{840}
|
||||
0: \x{840}
|
||||
\x{85e}
|
||||
0: \x{85e}
|
||||
\x{85e}
|
||||
0: \x{85e}
|
||||
\= Expect no match
|
||||
\x{85c}
|
||||
@ -2933,14 +2899,10 @@ No match
|
||||
0: \x{301}
|
||||
|
||||
/^a\X41z/alt_bsux,allow_empty_class,match_unset_backref,dupnames
|
||||
aX41z
|
||||
0: aX41z
|
||||
aX41z
|
||||
0: aX41z
|
||||
\= Expect no match
|
||||
aAz
|
||||
No match
|
||||
|
||||
/(?<=ab\Cde)X/utf
|
||||
No match
|
||||
|
||||
/\X/
|
||||
@ -3138,8 +3100,7 @@ Subject length lower bound = 3
|
||||
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
|
||||
0: \x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
|
||||
0+
|
||||
|
||||
/\x{3a3}++./i,utf,aftertext
|
||||
|
||||
/\x{3a3}++./i,utf,aftertext
|
||||
\= Expect no match
|
||||
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
|
||||
@ -3179,24 +3140,29 @@ No match
|
||||
clist 0053 0073 017f
|
||||
/i t
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
\= Expect no match
|
||||
ikt
|
||||
No match
|
||||
|
||||
/is+t/i,utf
|
||||
iSs\x{17f}t
|
||||
0: iSs\x{17f}t
|
||||
\= Expect no match
|
||||
ikt
|
||||
No match
|
||||
|
||||
/is+?t/i,utf
|
||||
\= Expect no match
|
||||
ikt
|
||||
No match
|
||||
|
||||
/is?t/i,utf
|
||||
\= Expect no match
|
||||
ikt
|
||||
No match
|
||||
|
||||
/is{2}t/i,utf
|
||||
\= Expect no match
|
||||
iskt
|
||||
@ -3211,80 +3177,70 @@ No match
|
||||
0: @
|
||||
`abc
|
||||
0: `
|
||||
\x{1234}abc
|
||||
0: \x{1234}
|
||||
\x{1234}abc
|
||||
0: \x{1234}
|
||||
\= Expect no match
|
||||
abc
|
||||
No match
|
||||
|
||||
/^\p{Xuc}+/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $@`\x{a0}\x{1234}\x{e000}
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $@`\x{a0}\x{1234}\x{e000}
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
No match
|
||||
|
||||
/^\p{Xuc}+?/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
No match
|
||||
|
||||
/^\p{Xuc}+?\*/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $@`\x{a0}\x{1234}\x{e000}*
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $@`\x{a0}\x{1234}\x{e000}*
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
No match
|
||||
|
||||
/^\p{Xuc}++/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $@`\x{a0}\x{1234}\x{e000}
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $@`\x{a0}\x{1234}\x{e000}
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
No match
|
||||
|
||||
/^\p{Xuc}{3,5}/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $@`\x{a0}\x{1234}
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $@`\x{a0}\x{1234}
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
No match
|
||||
|
||||
/^\p{Xuc}{3,5}?/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $@`
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $@`
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
No match
|
||||
|
||||
/^[\p{Xuc}]/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
No match
|
||||
|
||||
/^[\p{Xuc}]+/utf
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $@`\x{a0}\x{1234}\x{e000}
|
||||
$@`\x{a0}\x{1234}\x{e000}**
|
||||
0: $@`\x{a0}\x{1234}\x{e000}
|
||||
\= Expect no match
|
||||
\x{9f}
|
||||
No match
|
||||
|
||||
/^\P{Xuc}/utf
|
||||
abc
|
||||
0: a
|
||||
abc
|
||||
0: a
|
||||
\= Expect no match
|
||||
$abc
|
||||
@ -3297,8 +3253,7 @@ No match
|
||||
No match
|
||||
|
||||
/^[\P{Xuc}]/utf
|
||||
abc
|
||||
0: a
|
||||
abc
|
||||
0: a
|
||||
\= Expect no match
|
||||
$abc
|
||||
@ -3843,7 +3798,7 @@ No match
|
||||
[ab\p{L}]{2,3}+
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
------------------------------------------------------------------
|
||||
|
||||
/\D+\X \d+\X \S+\X \s+\X \W+\X \w+\X \R+\X \H+\X \h+\X \V+\X \v+\X a+\X \n+\X .+\X/Bx
|
||||
------------------------------------------------------------------
|
||||
@ -3858,8 +3813,6 @@ No match
|
||||
extuni
|
||||
\W+
|
||||
extuni
|
||||
\w+
|
||||
extuni
|
||||
\w+
|
||||
extuni
|
||||
\R+
|
||||
@ -3898,7 +3851,7 @@ No match
|
||||
/m $
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
------------------------------------------------------------------
|
||||
|
||||
/\X+\D \X+\d \X+\S \X+\s \X+\W \X+\w \X+. \X+\R \X+\H \X+\h \X+\V \X+\v \X+\X \X+\Z \X+\z \X+$/Bx
|
||||
------------------------------------------------------------------
|
||||
@ -3916,8 +3869,6 @@ No match
|
||||
extuni+
|
||||
\w
|
||||
extuni+
|
||||
Any
|
||||
extuni+
|
||||
Any
|
||||
extuni+
|
||||
\R
|
||||
@ -4003,12 +3954,9 @@ Subject length lower bound = 1
|
||||
/ábc/utf,replace=XሴZ
|
||||
123ábc123
|
||||
1: 123X\x{1234}Z123
|
||||
|
||||
|
||||
/(?<=abc)(|def)/g,utf,replace=<$0>
|
||||
123abcáyzabcdef789abcሴqr
|
||||
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
|
||||
|
||||
/[^\xff]((?1))/utf,debug
|
||||
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
|
||||
|
||||
/[A-`]/iB,utf
|
||||
@ -4050,4 +3998,238 @@ Failed: error 122 at offset 1227: unmatched closing parenthesis
|
||||
"\xa\xf<(.\pZ*\P{Xwd}+^\xa8\3'3yq.::?(?J:()\xd1+!~:3'(8?:)':(?'d'(?'d'^u]!.+.+\\A\Ah(n+?9){7}+\K;(?'X'u'(?'c'(?'z'(?<y>\xb::\xf0'|\xd3(\xae?'w(z\x8?P>l)\x8?P>a)'\H\R\xd1+!!~:3'(?:h$N{26875}\W+?\\=D{2}\x89(?i:Uy0\N({2\xa(\v\x85*){y*\A(()\p{L}+?\P{^Xan}'+?\xff\+pS\?|).{;y*\A(()\p{L}+?\8}\d?1(|)(/1){7}.+[Lp{Me}].\s\xdcC*?(?(<y>))(?<!^)$C((;*?(R))+(\xbf(R))\x8a\X*?\x8a\xb\xd1^9\3*+(\xc1,\k'R'\xb4)\xcc(z\z(?J)(?'X'\x1b(\xb\xd1^9\?'3*+P{^Xan}+?\xff\+(\xc1.]k+\xb'Pm'\xb4)\xcc4f\xa7'\xd1V(?i:U,{2,2})'(?'X'))?-%--\x95$9*\4'|\xd1(\x9c''%\x94$9)#(?'R')3\x7?('P\xed7'\xa8\xb1^u\xeaw\1\0\0\(|(?1){7}.+[\p{Me}].\s\xdcC*^\x14?(?(<y>))(?<!^)$C((;*?(R*?))+(?(R)\x8a\X*?\x8a\xb\xd1^9\3*+|(\xc1,\k'R'\xb4)\xcc! z)\z(?JJ)(?'X';(\xb\xd1^9\?'3*+(\xc1.]k+\xb'Pm'\xb4))':(?'d')(?'RD'(d')|)|$)'|(?<x>\g{d});\g{x}\x11\g{d}\x81\|$((?'X'\'X'(?'W''\x92()'9'\x83*))\xba*\!?^ <){)':;\xcc4'\xd1'(?'X'28))?-%--\x95$9*\4'|\xd1((''e\x94*$9:)*#(?'R')3)\x7?('P\xed')\\x16:;()\x1e\x10*:(?<y>)\xd1+0!~:(?)'d'E:yD!\s(?'R'\x1e;\x10:U))|'\x9g!\xb0*){)\\x16:;()\x1e\x10\x87*:(?<y>)\xd1+!~:(?)'}'\d'E:yD!\s(?'R'\x1e;\x10:U))|'))|)g!\xb0*R+9{29+)#(?'P'})*?pS\{3,}\x85,{0,}l{*UTF)(\xe{7}){3722,{9,}d{2,?|))|{)\(A?&d}}{\xa,}2}){3,}7,l{)22}(,}l:7{2,4}}29\x19+)#?'P'})*v?))\x5"
|
||||
Failed: error 122 at offset 1227: unmatched closing parenthesis
|
||||
|
||||
/$(&.+[\p{Me}].\s\xdcC*?(?(<y>))(?<!^)$C((;*?(R))+(?(R)){0,6}?|){12\x8a\X*?\x8a\x0b\xd1^9\3*+(\xc1,\k'P'\xb4)\xcc(z\z(?JJ)(?'X'8};(\x0b\xd1^9\?'3*+(\xc1.]k+\x0b'Pm'\xb4\xcc4'\xd1'(?'X'))?-%--\x95$9*\4'|\xd1(''%\x95*$9)#(?'R')3\x07?('P\xed')\\x16:;()\x1e\x10*:(?<y>)\xd1+!~:(?)''(d'E:yD!\s(?'R'\x1e;\x10:U))|')g!\xb0*){29+))#(?'P'})*?/
|
||||
|
||||
"(*UTF)(*UCP)(.UTF).+X(\V+;\^(\D|)!999}(?(?C{7(?C')\H*\S*/^\x5\xa\\xd3\x85n?(;\D*(?m).[^mH+((*UCP)(*U:F)})(?!^)(?'"
|
||||
Failed: error 162 at offset 113: subpattern name expected
|
||||
|
||||
/[\pS#moq]/
|
||||
=
|
||||
0: =
|
||||
|
||||
/(*:a\x{12345}b\t(d\)c)xxx/utf,alt_verbnames,mark
|
||||
cxxxz
|
||||
0: xxx
|
||||
MK: a\x{12345}b\x{09}(d)c
|
||||
|
||||
/abcd/utf,replace=x\x{824}y\o{3333}z(\Q12\$34$$\x34\E5$$),substitute_extended
|
||||
abcd
|
||||
1: x\x{824}y\x{6db}z(12\$34$$\x345$)
|
||||
|
||||
/a(\x{e0}\x{101})(\x{c0}\x{102})/utf,replace=a\u$1\U$1\E$1\l$2\L$2\Eab\U\x{e0}\x{101}\L\x{d0}\x{160}\EDone,substitute_extended
|
||||
a\x{e0}\x{101}\x{c0}\x{102}
|
||||
1: a\x{c0}\x{101}\x{c0}\x{100}\x{e0}\x{101}\x{e0}\x{102}\x{e0}\x{103}ab\x{c0}\x{100}\x{f0}\x{161}Done
|
||||
|
||||
/((?<digit>\d)|(?<letter>\p{L}))/g,substitute_extended,replace=<${digit:+digit; :not digit; }${letter:+letter:not a letter}>
|
||||
ab12cde
|
||||
7: <not digit; letter><not digit; letter><digit; not a letter><digit; not a letter><not digit; letter><not digit; letter><not digit; letter>
|
||||
|
||||
/(*UCP)(*UTF)[[:>:]]X/B
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
\b
|
||||
AssertB
|
||||
Reverse
|
||||
prop Xwd
|
||||
Ket
|
||||
X
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/abc/utf,replace=xyz
|
||||
abc\=zero_terminate
|
||||
1: xyz
|
||||
|
||||
/a[[:punct:]b]/ucp,bincode
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
a
|
||||
[b[:punct:]]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/a[[:punct:]b]/utf,ucp,bincode
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
a
|
||||
[b[:punct:]]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/a[b[:punct:]]/utf,ucp,bincode
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
a
|
||||
[b[:punct:]]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/[[:^ascii:]]/utf,ucp,bincode
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[\x80-\xff] (neg)
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/[[:^ascii:]\w]/utf,ucp,bincode
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[\x80-\xff\p{Xwd}\x{100}-\x{10ffff}]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/[\w[:^ascii:]]/utf,ucp,bincode
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[\x80-\xff\p{Xwd}\x{100}-\x{10ffff}]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/[^[:ascii:]\W]/utf,ucp,bincode
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[^\x00-\x7f\P{Xwd}]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
\x{de}
|
||||
0: \x{de}
|
||||
\x{200}
|
||||
0: \x{200}
|
||||
\= Expect no match
|
||||
\x{300}
|
||||
No match
|
||||
\x{37e}
|
||||
No match
|
||||
|
||||
/[[:^ascii:]a]/utf,ucp,bincode
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[a\x80-\xff] (neg)
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/L(?#(|++<!(2)?/B,utf,no_auto_possess,auto_callout
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
Callout 255 0 14
|
||||
L?
|
||||
Callout 255 14 0
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/L(?#(|++<!(2)?/B,utf,ucp,auto_callout
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
Callout 255 0 14
|
||||
L?+
|
||||
Callout 255 14 0
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/(*UTF)C\x09((?<!'(?x)!*H? #\xcc\x9a[^$]/
|
||||
Failed: error 114 at offset 39: missing closing parenthesis
|
||||
|
||||
/[\D]/utf
|
||||
\x{1d7cf}
|
||||
0: \x{1d7cf}
|
||||
|
||||
/[\D\P{Nd}]/utf
|
||||
\x{1d7cf}
|
||||
0: \x{1d7cf}
|
||||
|
||||
/[^\D]/utf
|
||||
a9b
|
||||
0: 9
|
||||
\= Expect no match
|
||||
\x{1d7cf}
|
||||
No match
|
||||
|
||||
/[^\D\P{Nd}]/utf
|
||||
a9b
|
||||
0: 9
|
||||
\x{1d7cf}
|
||||
0: \x{1d7cf}
|
||||
\= Expect no match
|
||||
\x{10000}
|
||||
No match
|
||||
|
||||
# Hex uses pattern length, not zero-terminated. This tests for overrunning
|
||||
# the given length of a pattern.
|
||||
|
||||
/'(*UTF)'/hex
|
||||
|
||||
/'#('/hex,extended,utf
|
||||
|
||||
/a(?<=A\XB)/utf
|
||||
Failed: error 125 at offset 1: lookbehind assertion is not fixed length
|
||||
|
||||
/ab(?<=A\RB)/utf
|
||||
Failed: error 125 at offset 2: lookbehind assertion is not fixed length
|
||||
|
||||
/../utf,auto_callout
|
||||
\n\x{123}\x{123}\x{123}\x{123}
|
||||
--->\x{0a}\x{123}\x{123}\x{123}\x{123}
|
||||
+0 ^ .
|
||||
+0 ^ .
|
||||
+1 ^ ^ .
|
||||
+2 ^ ^
|
||||
0: \x{123}\x{123}
|
||||
|
||||
# This tests processing wide characters in extended mode.
|
||||
|
||||
/XȀ/x,utf
|
||||
|
||||
# These three test a bug fix that was not clearing up after a locale setting
|
||||
# when the test or a subsequent one matched a wide character.
|
||||
|
||||
//locale=C
|
||||
|
||||
/[\P{Yi}]/utf
|
||||
\x{2f000}
|
||||
0: \x{2f000}
|
||||
|
||||
/[\P{Yi}]/utf,locale=C
|
||||
\x{2f000}
|
||||
0: \x{2f000}
|
||||
|
||||
/^(?<!(?=))/B,utf
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
^
|
||||
AssertB not
|
||||
Assert
|
||||
\x{10385c}
|
||||
Ket
|
||||
Ket
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
# Horizontal and vertical space lists ignore caseless
|
||||
|
||||
/[\HH]/Bi,utf
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/[^\HH]/Bi,utf
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[^\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}]
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
1023
pcre2/testdata/testoutput6
vendored
File diff suppressed because it is too large
623
pcre2/testdata/testoutput7
vendored
File diff suppressed because it is too large
@ -1,8 +1,11 @@
|
||||
# These are a few representative patterns whose lengths and offsets are to be
|
||||
# shown when the link size is 2. This is just a doublecheck test to ensure the
|
||||
# sizes don't go horribly wrong when something is changed. The pattern contents
|
||||
# are all themselves checked in other tests. Unicode, including property
|
||||
# support, is required for these tests.
|
||||
# There are two sorts of patterns in this test. A number of them are
|
||||
# representative patterns whose lengths and offsets are checked. This is just a
|
||||
# doublecheck test to ensure the sizes don't go horribly wrong when something
|
||||
# is changed. The operation of these patterns is checked in other tests.
|
||||
#
|
||||
# This file also contains tests whose output varies with code unit size and/or
|
||||
# link size. Unicode support is required for these tests. There are separate
|
||||
# output files for each code unit size and link size.
|
||||
|
||||
#pattern fullbincode,memory
|
||||
|
||||
@ -378,7 +381,7 @@ Options: utf
|
||||
First code unit = 'A'
|
||||
Last code unit = '.'
|
||||
Subject length lower bound = 4
|
||||
|
||||
|
||||
/\x{D55c}\x{ad6d}\x{C5B4}/I,utf
|
||||
Memory allocation (code space): 22
|
||||
------------------------------------------------------------------
|
||||
@ -842,11 +845,185 @@ Memory allocation (code space): 14
|
||||
|
||||
# Check the absolute limit on nesting (?| etc. This varies with code unit
|
||||
# width because the workspace is a different number of bytes. It will fail
|
||||
# in 8-bit and 16-bit but not in 32-bit.
|
||||
|
||||
# with link size 2 in 8-bit and 16-bit but not in 32-bit.
|
||||
|
||||
/(?|(?|(?J:(?|(?x:(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|
|
||||

|
||||
/parens_nest_limit=1000,-fullbincode
|
||||
Failed: error 184 at offset 1540: (?| and/or (?J: or (?x: parentheses are too deeply nested
|
||||
|
||||
# Use "expand" to create some very long patterns with nested parentheses, in
|
||||
# order to test workspace overflow. Again, this varies with code unit width,
|
||||
# and even when it fails in two modes, the error offset differs. It also varies
|
||||
# with link size - hence multiple tests with different values.
|
||||
|
||||
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
|
||||
Failed: error 186 at offset 5813: regular expression is too complicated
|
||||
|
||||
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
|
||||
Failed: error 186 at offset 5820: regular expression is too complicated
|
||||
|
||||
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
|
||||
Failed: error 186 at offset 12820: regular expression is too complicated
|
||||
|
||||
/(?(1)(?1)){8,}+()/debug
|
||||
------------------------------------------------------------------
|
||||
0 79 Bra
|
||||
2 70 Once
|
||||
4 6 Cond
|
||||
6 1 Cond ref
|
||||
8 74 Recurse
|
||||
10 6 Ket
|
||||
12 6 Cond
|
||||
14 1 Cond ref
|
||||
16 74 Recurse
|
||||
18 6 Ket
|
||||
20 6 Cond
|
||||
22 1 Cond ref
|
||||
24 74 Recurse
|
||||
26 6 Ket
|
||||
28 6 Cond
|
||||
30 1 Cond ref
|
||||
32 74 Recurse
|
||||
34 6 Ket
|
||||
36 6 Cond
|
||||
38 1 Cond ref
|
||||
40 74 Recurse
|
||||
42 6 Ket
|
||||
44 6 Cond
|
||||
46 1 Cond ref
|
||||
48 74 Recurse
|
||||
50 6 Ket
|
||||
52 6 Cond
|
||||
54 1 Cond ref
|
||||
56 74 Recurse
|
||||
58 6 Ket
|
||||
60 10 SBraPos
|
||||
62 6 SCond
|
||||
64 1 Cond ref
|
||||
66 74 Recurse
|
||||
68 6 Ket
|
||||
70 10 KetRpos
|
||||
72 70 Ket
|
||||
74 3 CBra 1
|
||||
77 3 Ket
|
||||
79 79 Ket
|
||||
81 End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
abcd
|
||||
0:
|
||||
1:
|
||||
|
||||
/(?(1)|a(?1)b){2,}+()/debug
|
||||
------------------------------------------------------------------
|
||||
0 43 Bra
|
||||
2 34 Once
|
||||
4 4 Cond
|
||||
6 1 Cond ref
|
||||
8 8 Alt
|
||||
10 a
|
||||
12 38 Recurse
|
||||
14 b
|
||||
16 12 Ket
|
||||
18 16 SBraPos
|
||||
20 4 SCond
|
||||
22 1 Cond ref
|
||||
24 8 Alt
|
||||
26 a
|
||||
28 38 Recurse
|
||||
30 b
|
||||
32 12 Ket
|
||||
34 16 KetRpos
|
||||
36 34 Ket
|
||||
38 3 CBra 1
|
||||
41 3 Ket
|
||||
43 43 Ket
|
||||
45 End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
abcde
|
||||
No match
|
||||
|
||||
/((?1)(?2)(?3)(?4)(?5)(?6)(?7)(?8)(?9)(?9)(?8)(?7)(?6)(?5)(?4)(?3)(?2)(?1)(?0)){2,}()()()()()()()()()/debug
|
||||
------------------------------------------------------------------
|
||||
0 133 Bra
|
||||
2 41 CBra 1
|
||||
5 2 Recurse
|
||||
7 88 Recurse
|
||||
9 93 Recurse
|
||||
11 98 Recurse
|
||||
13 103 Recurse
|
||||
15 108 Recurse
|
||||
17 113 Recurse
|
||||
19 118 Recurse
|
||||
21 123 Recurse
|
||||
23 123 Recurse
|
||||
25 118 Recurse
|
||||
27 113 Recurse
|
||||
29 108 Recurse
|
||||
31 103 Recurse
|
||||
33 98 Recurse
|
||||
35 93 Recurse
|
||||
37 88 Recurse
|
||||
39 2 Recurse
|
||||
41 0 Recurse
|
||||
43 41 Ket
|
||||
45 41 SCBra 1
|
||||
48 2 Recurse
|
||||
50 88 Recurse
|
||||
52 93 Recurse
|
||||
54 98 Recurse
|
||||
56 103 Recurse
|
||||
58 108 Recurse
|
||||
60 113 Recurse
|
||||
62 118 Recurse
|
||||
64 123 Recurse
|
||||
66 123 Recurse
|
||||
68 118 Recurse
|
||||
70 113 Recurse
|
||||
72 108 Recurse
|
||||
74 103 Recurse
|
||||
76 98 Recurse
|
||||
78 93 Recurse
|
||||
80 88 Recurse
|
||||
82 2 Recurse
|
||||
84 0 Recurse
|
||||
86 41 KetRmax
|
||||
88 3 CBra 2
|
||||
91 3 Ket
|
||||
93 3 CBra 3
|
||||
96 3 Ket
|
||||
98 3 CBra 4
|
||||
101 3 Ket
|
||||
103 3 CBra 5
|
||||
106 3 Ket
|
||||
108 3 CBra 6
|
||||
111 3 Ket
|
||||
113 3 CBra 7
|
||||
116 3 Ket
|
||||
118 3 CBra 8
|
||||
121 3 Ket
|
||||
123 3 CBra 9
|
||||
126 3 Ket
|
||||
128 3 CBra 10
|
||||
131 3 Ket
|
||||
133 133 Ket
|
||||
135 End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 10
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
|
||||

|
||||
Failed: error 114 at offset 509: missing closing parenthesis
|
||||
|
||||
fullbincode
|
||||
|
||||
# End of testinput8
|
1026
pcre2/testdata/testoutput8-16-3
vendored
Normal file
File diff suppressed because it is too large
@ -1,8 +1,11 @@
|
||||
# These are a few representative patterns whose lengths and offsets are to be
|
||||
# shown when the link size is 2. This is just a doublecheck test to ensure the
|
||||
# sizes don't go horribly wrong when something is changed. The pattern contents
|
||||
# are all themselves checked in other tests. Unicode, including property
|
||||
# support, is required for these tests.
|
||||
# There are two sorts of patterns in this test. A number of them are
|
||||
# representative patterns whose lengths and offsets are checked. This is just a
|
||||
# doublecheck test to ensure the sizes don't go horribly wrong when something
|
||||
# is changed. The operation of these patterns is checked in other tests.
|
||||
#
|
||||
# This file also contains tests whose output varies with code unit size and/or
|
||||
# link size. Unicode support is required for these tests. There are separate
|
||||
# output files for each code unit size and link size.
|
||||
|
||||
#pattern fullbincode,memory
|
||||
|
||||
@ -378,7 +381,7 @@ Options: utf
|
||||
First code unit = 'A'
|
||||
Last code unit = '.'
|
||||
Subject length lower bound = 4
|
||||
|
||||
|
||||
/\x{D55c}\x{ad6d}\x{C5B4}/I,utf
|
||||
Memory allocation (code space): 44
|
||||
------------------------------------------------------------------
|
||||
@ -842,10 +845,184 @@ Memory allocation (code space): 28
|
||||
|
||||
# Check the absolute limit on nesting (?| etc. This varies with code unit
|
||||
# width because the workspace is a different number of bytes. It will fail
|
||||
# in 8-bit and 16-bit but not in 32-bit.
|
||||
|
||||
# with link size 2 in 8-bit and 16-bit but not in 32-bit.
|
||||
|
||||
/(?|(?|(?J:(?|(?x:(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|
|
||||

|
||||
/parens_nest_limit=1000,-fullbincode
|
||||
|
||||
# Use "expand" to create some very long patterns with nested parentheses, in
|
||||
# order to test workspace overflow. Again, this varies with code unit width,
|
||||
# and even when it fails in two modes, the error offset differs. It also varies
|
||||
# with link size - hence multiple tests with different values.
|
||||
|
||||
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
|
||||
Failed: error 186 at offset 5813: regular expression is too complicated
|
||||
|
||||
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
|
||||
Failed: error 186 at offset 5820: regular expression is too complicated
|
||||
|
||||
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
|
||||
Failed: error 186 at offset 12820: regular expression is too complicated
|
||||
|
||||
/(?(1)(?1)){8,}+()/debug
|
||||
------------------------------------------------------------------
|
||||
0 79 Bra
|
||||
2 70 Once
|
||||
4 6 Cond
|
||||
6 1 Cond ref
|
||||
8 74 Recurse
|
||||
10 6 Ket
|
||||
12 6 Cond
|
||||
14 1 Cond ref
|
||||
16 74 Recurse
|
||||
18 6 Ket
|
||||
20 6 Cond
|
||||
22 1 Cond ref
|
||||
24 74 Recurse
|
||||
26 6 Ket
|
||||
28 6 Cond
|
||||
30 1 Cond ref
|
||||
32 74 Recurse
|
||||
34 6 Ket
|
||||
36 6 Cond
|
||||
38 1 Cond ref
|
||||
40 74 Recurse
|
||||
42 6 Ket
|
||||
44 6 Cond
|
||||
46 1 Cond ref
|
||||
48 74 Recurse
|
||||
50 6 Ket
|
||||
52 6 Cond
|
||||
54 1 Cond ref
|
||||
56 74 Recurse
|
||||
58 6 Ket
|
||||
60 10 SBraPos
|
||||
62 6 SCond
|
||||
64 1 Cond ref
|
||||
66 74 Recurse
|
||||
68 6 Ket
|
||||
70 10 KetRpos
|
||||
72 70 Ket
|
||||
74 3 CBra 1
|
||||
77 3 Ket
|
||||
79 79 Ket
|
||||
81 End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
abcd
|
||||
0:
|
||||
1:
|
||||
|
||||
/(?(1)|a(?1)b){2,}+()/debug
|
||||
------------------------------------------------------------------
|
||||
0 43 Bra
|
||||
2 34 Once
|
||||
4 4 Cond
|
||||
6 1 Cond ref
|
||||
8 8 Alt
|
||||
10 a
|
||||
12 38 Recurse
|
||||
14 b
|
||||
16 12 Ket
|
||||
18 16 SBraPos
|
||||
20 4 SCond
|
||||
22 1 Cond ref
|
||||
24 8 Alt
|
||||
26 a
|
||||
28 38 Recurse
|
||||
30 b
|
||||
32 12 Ket
|
||||
34 16 KetRpos
|
||||
36 34 Ket
|
||||
38 3 CBra 1
|
||||
41 3 Ket
|
||||
43 43 Ket
|
||||
45 End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
abcde
|
||||
No match
|
||||
|
||||
/((?1)(?2)(?3)(?4)(?5)(?6)(?7)(?8)(?9)(?9)(?8)(?7)(?6)(?5)(?4)(?3)(?2)(?1)(?0)){2,}()()()()()()()()()/debug
|
||||
------------------------------------------------------------------
|
||||
0 133 Bra
|
||||
2 41 CBra 1
|
||||
5 2 Recurse
|
||||
7 88 Recurse
|
||||
9 93 Recurse
|
||||
11 98 Recurse
|
||||
13 103 Recurse
|
||||
15 108 Recurse
|
||||
17 113 Recurse
|
||||
19 118 Recurse
|
||||
21 123 Recurse
|
||||
23 123 Recurse
|
||||
25 118 Recurse
|
||||
27 113 Recurse
|
||||
29 108 Recurse
|
||||
31 103 Recurse
|
||||
33 98 Recurse
|
||||
35 93 Recurse
|
||||
37 88 Recurse
|
||||
39 2 Recurse
|
||||
41 0 Recurse
|
||||
43 41 Ket
|
||||
45 41 SCBra 1
|
||||
48 2 Recurse
|
||||
50 88 Recurse
|
||||
52 93 Recurse
|
||||
54 98 Recurse
|
||||
56 103 Recurse
|
||||
58 108 Recurse
|
||||
60 113 Recurse
|
||||
62 118 Recurse
|
||||
64 123 Recurse
|
||||
66 123 Recurse
|
||||
68 118 Recurse
|
||||
70 113 Recurse
|
||||
72 108 Recurse
|
||||
74 103 Recurse
|
||||
76 98 Recurse
|
||||
78 93 Recurse
|
||||
80 88 Recurse
|
||||
82 2 Recurse
|
||||
84 0 Recurse
|
||||
86 41 KetRmax
|
||||
88 3 CBra 2
|
||||
91 3 Ket
|
||||
93 3 CBra 3
|
||||
96 3 Ket
|
||||
98 3 CBra 4
|
||||
101 3 Ket
|
||||
103 3 CBra 5
|
||||
106 3 Ket
|
||||
108 3 CBra 6
|
||||
111 3 Ket
|
||||
113 3 CBra 7
|
||||
116 3 Ket
|
||||
118 3 CBra 8
|
||||
121 3 Ket
|
||||
123 3 CBra 9
|
||||
126 3 Ket
|
||||
128 3 CBra 10
|
||||
131 3 Ket
|
||||
133 133 Ket
|
||||
135 End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 10
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
|
||||

|
||||
Failed: error 114 at offset 509: missing closing parenthesis
|
||||
|
||||
fullbincode
|
||||
|
||||
# End of testinput8
|
1028
pcre2/testdata/testoutput8-32-3
vendored
Normal file
File diff suppressed because it is too large
1028
pcre2/testdata/testoutput8-32-4
vendored
Normal file
File diff suppressed because it is too large
@ -1,8 +1,11 @@
|
||||
# These are a few representative patterns whose lengths and offsets are to be
|
||||
# shown when the link size is 2. This is just a doublecheck test to ensure the
|
||||
# sizes don't go horribly wrong when something is changed. The pattern contents
|
||||
# are all themselves checked in other tests. Unicode, including property
|
||||
# support, is required for these tests.
|
||||
# There are two sorts of patterns in this test. A number of them are
|
||||
# representative patterns whose lengths and offsets are checked. This is just a
|
||||
# doublecheck test to ensure the sizes don't go horribly wrong when something
|
||||
# is changed. The operation of these patterns is checked in other tests.
|
||||
#
|
||||
# This file also contains tests whose output varies with code unit size and/or
|
||||
# link size. Unicode support is required for these tests. There are separate
|
||||
# output files for each code unit size and link size.
|
||||
|
||||
#pattern fullbincode,memory
|
||||
|
||||
@ -378,7 +381,7 @@ Options: utf
|
||||
First code unit = 'A'
|
||||
Last code unit = '.'
|
||||
Subject length lower bound = 4
|
||||
|
||||
|
||||
/\x{D55c}\x{ad6d}\x{C5B4}/I,utf
|
||||
Memory allocation (code space): 19
|
||||
------------------------------------------------------------------
|
||||
@ -842,11 +845,184 @@ Memory allocation (code space): 10
|
||||
|
||||
# Check the absolute limit on nesting (?| etc. This varies with code unit
|
||||
# width because the workspace is a different number of bytes. It will fail
|
||||
# in 8-bit and 16-bit but not in 32-bit.
|
||||
|
||||
# with link size 2 in 8-bit and 16-bit but not in 32-bit.
|
||||
|
||||
/(?|(?|(?J:(?|(?x:(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|(?|
|
||||

|
||||
/parens_nest_limit=1000,-fullbincode
|
||||
Failed: error 184 at offset 1540: (?| and/or (?J: or (?x: parentheses are too deeply nested
|
||||
|
||||
# Use "expand" to create some very long patterns with nested parentheses, in
|
||||
# order to test workspace overflow. Again, this varies with code unit width,
|
||||
# and even when it fails in two modes, the error offset differs. It also varies
|
||||
# with link size - hence multiple tests with different values.
|
||||
|
||||
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
|
||||
|
||||
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
|
||||
Failed: error 186 at offset 5820: regular expression is too complicated
|
||||
|
||||
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
|
||||
Failed: error 186 at offset 12820: regular expression is too complicated
|
||||
|
||||
/(?(1)(?1)){8,}+()/debug
|
||||
------------------------------------------------------------------
|
||||
0 119 Bra
|
||||
3 105 Once
|
||||
6 9 Cond
|
||||
9 1 Cond ref
|
||||
12 111 Recurse
|
||||
15 9 Ket
|
||||
18 9 Cond
|
||||
21 1 Cond ref
|
||||
24 111 Recurse
|
||||
27 9 Ket
|
||||
30 9 Cond
|
||||
33 1 Cond ref
|
||||
36 111 Recurse
|
||||
39 9 Ket
|
||||
42 9 Cond
|
||||
45 1 Cond ref
|
||||
48 111 Recurse
|
||||
51 9 Ket
|
||||
54 9 Cond
|
||||
57 1 Cond ref
|
||||
60 111 Recurse
|
||||
63 9 Ket
|
||||
66 9 Cond
|
||||
69 1 Cond ref
|
||||
72 111 Recurse
|
||||
75 9 Ket
|
||||
78 9 Cond
|
||||
81 1 Cond ref
|
||||
84 111 Recurse
|
||||
87 9 Ket
|
||||
90 15 SBraPos
|
||||
93 9 SCond
|
||||
96 1 Cond ref
|
||||
99 111 Recurse
|
||||
102 9 Ket
|
||||
105 15 KetRpos
|
||||
108 105 Ket
|
||||
111 5 CBra 1
|
||||
116 5 Ket
|
||||
119 119 Ket
|
||||
122 End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
abcd
|
||||
0:
|
||||
1:
|
||||
|
||||
/(?(1)|a(?1)b){2,}+()/debug
|
||||
------------------------------------------------------------------
|
||||
0 61 Bra
|
||||
3 47 Once
|
||||
6 6 Cond
|
||||
9 1 Cond ref
|
||||
12 10 Alt
|
||||
15 a
|
||||
17 53 Recurse
|
||||
20 b
|
||||
22 16 Ket
|
||||
25 22 SBraPos
|
||||
28 6 SCond
|
||||
31 1 Cond ref
|
||||
34 10 Alt
|
||||
37 a
|
||||
39 53 Recurse
|
||||
42 b
|
||||
44 16 Ket
|
||||
47 22 KetRpos
|
||||
50 47 Ket
|
||||
53 5 CBra 1
|
||||
58 5 Ket
|
||||
61 61 Ket
|
||||
64 End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
abcde
|
||||
No match
|
||||
|
||||
/((?1)(?2)(?3)(?4)(?5)(?6)(?7)(?8)(?9)(?9)(?8)(?7)(?6)(?5)(?4)(?3)(?2)(?1)(?0)){2,}()()()()()()()()()/debug
|
||||
------------------------------------------------------------------
|
||||
0 205 Bra
|
||||
3 62 CBra 1
|
||||
8 3 Recurse
|
||||
11 133 Recurse
|
||||
14 141 Recurse
|
||||
17 149 Recurse
|
||||
20 157 Recurse
|
||||
23 165 Recurse
|
||||
26 173 Recurse
|
||||
29 181 Recurse
|
||||
32 189 Recurse
|
||||
35 189 Recurse
|
||||
38 181 Recurse
|
||||
41 173 Recurse
|
||||
44 165 Recurse
|
||||
47 157 Recurse
|
||||
50 149 Recurse
|
||||
53 141 Recurse
|
||||
56 133 Recurse
|
||||
59 3 Recurse
|
||||
62 0 Recurse
|
||||
65 62 Ket
|
||||
68 62 SCBra 1
|
||||
73 3 Recurse
|
||||
76 133 Recurse
|
||||
79 141 Recurse
|
||||
82 149 Recurse
|
||||
85 157 Recurse
|
||||
88 165 Recurse
|
||||
91 173 Recurse
|
||||
94 181 Recurse
|
||||
97 189 Recurse
|
||||
100 189 Recurse
|
||||
103 181 Recurse
|
||||
106 173 Recurse
|
||||
109 165 Recurse
|
||||
112 157 Recurse
|
||||
115 149 Recurse
|
||||
118 141 Recurse
|
||||
121 133 Recurse
|
||||
124 3 Recurse
|
||||
127 0 Recurse
|
||||
130 62 KetRmax
|
||||
133 5 CBra 2
|
||||
138 5 Ket
|
||||
141 5 CBra 3
|
||||
146 5 Ket
|
||||
149 5 CBra 4
|
||||
154 5 Ket
|
||||
157 5 CBra 5
|
||||
162 5 Ket
|
||||
165 5 CBra 6
|
||||
170 5 Ket
|
||||
173 5 CBra 7
|
||||
178 5 Ket
|
||||
181 5 CBra 8
|
||||
186 5 Ket
|
||||
189 5 CBra 9
|
||||
194 5 Ket
|
||||
197 5 CBra 10
|
||||
202 5 Ket
|
||||
205 205 Ket
|
||||
208 End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 10
|
||||
May match empty string
|
||||
Subject length lower bound = 0
|
||||
|
||||

|
||||
Failed: error 114 at offset 509: missing closing parenthesis
|
||||
|
||||
fullbincode
|
||||
|
||||
# End of testinput8
|
1026
pcre2/testdata/testoutput8-8-3
vendored
Normal file
File diff suppressed because it is too large
1026
pcre2/testdata/testoutput8-8-4
vendored
Normal file
File diff suppressed because it is too large
31
pcre2/testdata/testoutput9
vendored
@ -2,14 +2,10 @@
|
||||
# UTF-8 or Unicode property support. */
|
||||
|
||||
#forbid_utf
|
||||
#newline_default lf any anycrlf
|
||||
|
||||
/a\Cb/
|
||||
aXb
|
||||
0: aXb
|
||||
a\nb
|
||||
0: a\x0ab
|
||||
** Failers (too big char)
|
||||
No match
|
||||
/ab/
|
||||
\= Expect error message (too big char) and no match
|
||||
A\x{123}B
|
||||
** Character \x{123} is greater than 255 and UTF-8 mode is not enabled.
|
||||
** Truncation will probably give the wrong result.
|
||||
@ -311,22 +307,31 @@ Subject length lower bound = 1
|
||||
------------------------------------------------------------------
|
||||
|
||||
/\777/I
|
||||
Failed: error 151 at offset 3: octal value is greater than \377 in 8-bit non-UTF-8 mode
|
||||
Failed: error 151 at offset 4: octal value is greater than \377 in 8-bit non-UTF-8 mode
|
||||
|
||||
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark
|
||||
Failed: error 176 at offset 259: name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN)
|
||||
XX
|
||||
|
||||
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark,alt_verbnames
|
||||
Failed: error 176 at offset 259: name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN)
|
||||
XX
|
||||
|
||||
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark
|
||||
XX
|
||||
0: XX
|
||||
MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE
|
||||
|
||||
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark,alt_verbnames
|
||||
XX
|
||||
0: XX
|
||||
MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE
|
||||
|
||||
/\u0100/alt_bsux,allow_empty_class,match_unset_backref,dupnames
|
||||
Failed: error 177 at offset 5: character code point value in \u.... sequence is too large
|
||||
Failed: error 177 at offset 6: character code point value in \u.... sequence is too large
|
||||
|
||||
/[\u0100-\u0200]/alt_bsux,allow_empty_class,match_unset_backref,dupnames
|
||||
Failed: error 177 at offset 6: character code point value in \u.... sequence is too large
|
||||
Failed: error 177 at offset 7: character code point value in \u.... sequence is too large
|
||||
|
||||
/[^\x00-a]{12,}[^b-\xff]*/B
|
||||
------------------------------------------------------------------
|
||||
@ -356,4 +361,10 @@ Failed: error 177 at offset 6: character code point value in \u.... sequence is
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/(*MARK:a\x{100}b)z/alt_verbnames
|
||||
Failed: error 134 at offset 14: character code point value in \x{} or \o{} is too large
|
||||
|
||||
/(*:*++++++++++++''''''''''''''''''''+''+++'+++x+++++++++++++++++++++++++++++++++++(++++++++++++++++++++:++++++%++:''''''''''''''''''''''''+++++++++++++++++++++++++++++++++++++++++++++++++++++-++++++++k+++++++''''+++'+++++++++++++++++++++++''''++++++++++++':ƿ)/
|
||||
Failed: error 176 at offset 259: name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN)
|
||||
|
||||
# End of testinput9
|
||||
|
15
pcre2/testdata/valgrind-jit.supp
vendored
Normal file
@ -0,0 +1,15 @@
|
||||
{
|
||||
name
|
||||
Memcheck:Addr16
|
||||
obj:???
|
||||
obj:???
|
||||
obj:???
|
||||
}
|
||||
|
||||
{
|
||||
name
|
||||
Memcheck:Cond
|
||||
obj:???
|
||||
obj:???
|
||||
obj:???
|
||||
}
|
8
pcre2/testdata/wintestoutput3
vendored
@ -159,7 +159,7 @@ No match
|
||||
/[[:alpha:]][[:lower:]][[:upper:]]/IB
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[A-Za-z\x83\x8a\x8c\x8e\x9a\x9c\x9e\x9f\xaa\xb2\xb3\xb5\xb9\xba\xc0-\xd6\xd8-\xf6\xf8-\xff]
|
||||
[A-Za-z\x83\x8a\x8c\x8e\x9a\x9c\x9e\x9f\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff]
|
||||
[a-z\x83\x9a\x9c\x9e\xaa\xb5\xba\xdf-\xf6\xf8-\xff]
|
||||
[A-Z\x8a\x8c\x8e\x9f\xc0-\xd6\xd8-\xde]
|
||||
Ket
|
||||
@ -167,9 +167,9 @@ No match
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 0
|
||||
Starting code units: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
|
||||
a b c d e f g h i j k l m n o p q r s t u v w x y z � � � � � � � � � � �
|
||||
� � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � �
|
||||
� � � � � � � � � � � � � � � � � � � � � � � � � � � �
|
||||
a b c d e f g h i j k l m n o p q r s t u v w x y z � � � � � � � � � � �
|
||||
� � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � �
|
||||
� � � � � � � � � � � � � � � � � � � � � � � � �
|
||||
Subject length lower bound = 3
|
||||
|
||||
# End of testinput3
|
||||
|