Contents of /code/trunk/testdata/testoutput15

Revision 1261
Wed Feb 27 16:27:01 2013 UTC by ph10
File size: 28107 byte(s)
Correct Unicode string checking in the light of corrigendum #9.
1 /-- This set of tests is for UTF-8 support, and is relevant only to the 8-bit
2 library. --/
3
4 /X(\C{3})/8
5 X\x{1234}
6 0: X\x{1234}
7 1: \x{1234}
8
9 /X(\C{4})/8
10 X\x{1234}YZ
11 0: X\x{1234}Y
12 1: \x{1234}Y
13
14 /X\C*/8
15 XYZabcdce
16 0: XYZabcdce
17
18 /X\C*?/8
19 XYZabcde
20 0: X
21
22 /X\C{3,5}/8
23 Xabcdefg
24 0: Xabcde
25 X\x{1234}
26 0: X\x{1234}
27 X\x{1234}YZ
28 0: X\x{1234}YZ
29 X\x{1234}\x{512}
30 0: X\x{1234}\x{512}
31 X\x{1234}\x{512}YZ
32 0: X\x{1234}\x{512}
33
34 /X\C{3,5}?/8
35 Xabcdefg
36 0: Xabc
37 X\x{1234}
38 0: X\x{1234}
39 X\x{1234}YZ
40 0: X\x{1234}
41 X\x{1234}\x{512}
42 0: X\x{1234}
43
44 /a\Cb/8
45 aXb
46 0: aXb
47 a\nb
48 0: a\x{0a}b
49
50 /a\C\Cb/8
51 a\x{100}b
52 0: a\x{100}b
53
54 /ab\Cde/8
55 abXde
56 0: abXde
57
58 /a\C\Cb/8
59 a\x{100}b
60 0: a\x{100}b
61 ** Failers
62 No match
63 a\x{12257}b
64 No match
65
66 /[]/8
67 Failed: invalid UTF-8 string at offset 1
68
69 //8
70 Failed: invalid UTF-8 string at offset 0
71
72 /xxx/8
73 Failed: invalid UTF-8 string at offset 0
74
75 /xxx/8?DZSS
76 ------------------------------------------------------------------
77 Bra
78 \X{c0}\X{c0}\X{c0}xxx
79 Ket
80 End
81 ------------------------------------------------------------------
82 Capturing subpattern count = 0
83 Options: utf no_utf_check
84 First char = \x{c3}
85 Need char = 'x'
86
87 /badutf/8
88 \xdf
89 Error -10 (bad UTF-8 string) offset=0 reason=1
90 \xef
91 Error -10 (bad UTF-8 string) offset=0 reason=2
92 \xef\x80
93 Error -10 (bad UTF-8 string) offset=0 reason=1
94 \xf7
95 Error -10 (bad UTF-8 string) offset=0 reason=3
96 \xf7\x80
97 Error -10 (bad UTF-8 string) offset=0 reason=2
98 \xf7\x80\x80
99 Error -10 (bad UTF-8 string) offset=0 reason=1
100 \xfb
101 Error -10 (bad UTF-8 string) offset=0 reason=4
102 \xfb\x80
103 Error -10 (bad UTF-8 string) offset=0 reason=3
104 \xfb\x80\x80
105 Error -10 (bad UTF-8 string) offset=0 reason=2
106 \xfb\x80\x80\x80
107 Error -10 (bad UTF-8 string) offset=0 reason=1
108 \xfd
109 Error -10 (bad UTF-8 string) offset=0 reason=5
110 \xfd\x80
111 Error -10 (bad UTF-8 string) offset=0 reason=4
112 \xfd\x80\x80
113 Error -10 (bad UTF-8 string) offset=0 reason=3
114 \xfd\x80\x80\x80
115 Error -10 (bad UTF-8 string) offset=0 reason=2
116 \xfd\x80\x80\x80\x80
117 Error -10 (bad UTF-8 string) offset=0 reason=1
118 \xdf\x7f
119 Error -10 (bad UTF-8 string) offset=0 reason=6
120 \xef\x7f\x80
121 Error -10 (bad UTF-8 string) offset=0 reason=6
122 \xef\x80\x7f
123 Error -10 (bad UTF-8 string) offset=0 reason=7
124 \xf7\x7f\x80\x80
125 Error -10 (bad UTF-8 string) offset=0 reason=6
126 \xf7\x80\x7f\x80
127 Error -10 (bad UTF-8 string) offset=0 reason=7
128 \xf7\x80\x80\x7f
129 Error -10 (bad UTF-8 string) offset=0 reason=8
130 \xfb\x7f\x80\x80\x80
131 Error -10 (bad UTF-8 string) offset=0 reason=6
132 \xfb\x80\x7f\x80\x80
133 Error -10 (bad UTF-8 string) offset=0 reason=7
134 \xfb\x80\x80\x7f\x80
135 Error -10 (bad UTF-8 string) offset=0 reason=8
136 \xfb\x80\x80\x80\x7f
137 Error -10 (bad UTF-8 string) offset=0 reason=9
138 \xfd\x7f\x80\x80\x80\x80
139 Error -10 (bad UTF-8 string) offset=0 reason=6
140 \xfd\x80\x7f\x80\x80\x80
141 Error -10 (bad UTF-8 string) offset=0 reason=7
142 \xfd\x80\x80\x7f\x80\x80
143 Error -10 (bad UTF-8 string) offset=0 reason=8
144 \xfd\x80\x80\x80\x7f\x80
145 Error -10 (bad UTF-8 string) offset=0 reason=9
146 \xfd\x80\x80\x80\x80\x7f
147 Error -10 (bad UTF-8 string) offset=0 reason=10
148 \xed\xa0\x80
149 Error -10 (bad UTF-8 string) offset=0 reason=14
150 \xc0\x8f
151 Error -10 (bad UTF-8 string) offset=0 reason=15
152 \xe0\x80\x8f
153 Error -10 (bad UTF-8 string) offset=0 reason=16
154 \xf0\x80\x80\x8f
155 Error -10 (bad UTF-8 string) offset=0 reason=17
156 \xf8\x80\x80\x80\x8f
157 Error -10 (bad UTF-8 string) offset=0 reason=18
158 \xfc\x80\x80\x80\x80\x8f
159 Error -10 (bad UTF-8 string) offset=0 reason=19
160 \x80
161 Error -10 (bad UTF-8 string) offset=0 reason=20
162 \xfe
163 Error -10 (bad UTF-8 string) offset=0 reason=21
164 \xff
165 Error -10 (bad UTF-8 string) offset=0 reason=21
166
167 /badutf/8
168 \xfb\x80\x80\x80\x80
169 Error -10 (bad UTF-8 string) offset=0 reason=11
170 \xfd\x80\x80\x80\x80\x80
171 Error -10 (bad UTF-8 string) offset=0 reason=12
172 \xf7\xbf\xbf\xbf
173 Error -10 (bad UTF-8 string) offset=0 reason=13
174
175 /shortutf/8
176 \P\P\xdf
177 Error -25 (short UTF-8 string) offset=0 reason=1
178 \P\P\xef
179 Error -25 (short UTF-8 string) offset=0 reason=2
180 \P\P\xef\x80
181 Error -25 (short UTF-8 string) offset=0 reason=1
182 \P\P\xf7
183 Error -25 (short UTF-8 string) offset=0 reason=3
184 \P\P\xf7\x80
185 Error -25 (short UTF-8 string) offset=0 reason=2
186 \P\P\xf7\x80\x80
187 Error -25 (short UTF-8 string) offset=0 reason=1
188 \P\P\xfb
189 Error -25 (short UTF-8 string) offset=0 reason=4
190 \P\P\xfb\x80
191 Error -25 (short UTF-8 string) offset=0 reason=3
192 \P\P\xfb\x80\x80
193 Error -25 (short UTF-8 string) offset=0 reason=2
194 \P\P\xfb\x80\x80\x80
195 Error -25 (short UTF-8 string) offset=0 reason=1
196 \P\P\xfd
197 Error -25 (short UTF-8 string) offset=0 reason=5
198 \P\P\xfd\x80
199 Error -25 (short UTF-8 string) offset=0 reason=4
200 \P\P\xfd\x80\x80
201 Error -25 (short UTF-8 string) offset=0 reason=3
202 \P\P\xfd\x80\x80\x80
203 Error -25 (short UTF-8 string) offset=0 reason=2
204 \P\P\xfd\x80\x80\x80\x80
205 Error -25 (short UTF-8 string) offset=0 reason=1
206
207 /anything/8
208 \xc0\x80
209 Error -10 (bad UTF-8 string) offset=0 reason=15
210 \xc1\x8f
211 Error -10 (bad UTF-8 string) offset=0 reason=15
212 \xe0\x9f\x80
213 Error -10 (bad UTF-8 string) offset=0 reason=16
214 \xf0\x8f\x80\x80
215 Error -10 (bad UTF-8 string) offset=0 reason=17
216 \xf8\x87\x80\x80\x80
217 Error -10 (bad UTF-8 string) offset=0 reason=18
218 \xfc\x83\x80\x80\x80\x80
219 Error -10 (bad UTF-8 string) offset=0 reason=19
220 \xfe\x80\x80\x80\x80\x80
221 Error -10 (bad UTF-8 string) offset=0 reason=21
222 \xff\x80\x80\x80\x80\x80
223 Error -10 (bad UTF-8 string) offset=0 reason=21
224 \xc3\x8f
225 No match
226 \xe0\xaf\x80
227 No match
228 \xe1\x80\x80
229 No match
230 \xf0\x9f\x80\x80
231 No match
232 \xf1\x8f\x80\x80
233 No match
234 \xf8\x88\x80\x80\x80
235 Error -10 (bad UTF-8 string) offset=0 reason=11
236 \xf9\x87\x80\x80\x80
237 Error -10 (bad UTF-8 string) offset=0 reason=11
238 \xfc\x84\x80\x80\x80\x80
239 Error -10 (bad UTF-8 string) offset=0 reason=12
240 \xfd\x83\x80\x80\x80\x80
241 Error -10 (bad UTF-8 string) offset=0 reason=12
242 \?\xf8\x88\x80\x80\x80
243 No match
244 \?\xf9\x87\x80\x80\x80
245 No match
246 \?\xfc\x84\x80\x80\x80\x80
247 No match
248 \?\xfd\x83\x80\x80\x80\x80
249 No match
250
251 /./8
252 \x{fffe}
253 0: \x{fffe}
254 \x{ffff}
255 0: \x{ffff}
256 \x{1fffe}
257 0: \x{1fffe}
258 \x{1ffff}
259 0: \x{1ffff}
260 \x{2fffe}
261 0: \x{2fffe}
262 \x{2ffff}
263 0: \x{2ffff}
264 \x{3fffe}
265 0: \x{3fffe}
266 \x{3ffff}
267 0: \x{3ffff}
268 \x{4fffe}
269 0: \x{4fffe}
270 \x{4ffff}
271 0: \x{4ffff}
272 \x{5fffe}
273 0: \x{5fffe}
274 \x{5ffff}
275 0: \x{5ffff}
276 \x{6fffe}
277 0: \x{6fffe}
278 \x{6ffff}
279 0: \x{6ffff}
280 \x{7fffe}
281 0: \x{7fffe}
282 \x{7ffff}
283 0: \x{7ffff}
284 \x{8fffe}
285 0: \x{8fffe}
286 \x{8ffff}
287 0: \x{8ffff}
288 \x{9fffe}
289 0: \x{9fffe}
290 \x{9ffff}
291 0: \x{9ffff}
292 \x{afffe}
293 0: \x{afffe}
294 \x{affff}
295 0: \x{affff}
296 \x{bfffe}
297 0: \x{bfffe}
298 \x{bffff}
299 0: \x{bffff}
300 \x{cfffe}
301 0: \x{cfffe}
302 \x{cffff}
303 0: \x{cffff}
304 \x{dfffe}
305 0: \x{dfffe}
306 \x{dffff}
307 0: \x{dffff}
308 \x{efffe}
309 0: \x{efffe}
310 \x{effff}
311 0: \x{effff}
312 \x{ffffe}
313 0: \x{ffffe}
314 \x{fffff}
315 0: \x{fffff}
316 \x{10fffe}
317 0: \x{10fffe}
318 \x{10ffff}
319 0: \x{10ffff}
320 \x{fdd0}
321 0: \x{fdd0}
322 \x{fdd1}
323 0: \x{fdd1}
324 \x{fdd2}
325 0: \x{fdd2}
326 \x{fdd3}
327 0: \x{fdd3}
328 \x{fdd4}
329 0: \x{fdd4}
330 \x{fdd5}
331 0: \x{fdd5}
332 \x{fdd6}
333 0: \x{fdd6}
334 \x{fdd7}
335 0: \x{fdd7}
336 \x{fdd8}
337 0: \x{fdd8}
338 \x{fdd9}
339 0: \x{fdd9}
340 \x{fdda}
341 0: \x{fdda}
342 \x{fddb}
343 0: \x{fddb}
344 \x{fddc}
345 0: \x{fddc}
346 \x{fddd}
347 0: \x{fddd}
348 \x{fdde}
349 0: \x{fdde}
350 \x{fddf}
351 0: \x{fddf}
352 \x{fde0}
353 0: \x{fde0}
354 \x{fde1}
355 0: \x{fde1}
356 \x{fde2}
357 0: \x{fde2}
358 \x{fde3}
359 0: \x{fde3}
360 \x{fde4}
361 0: \x{fde4}
362 \x{fde5}
363 0: \x{fde5}
364 \x{fde6}
365 0: \x{fde6}
366 \x{fde7}
367 0: \x{fde7}
368 \x{fde8}
369 0: \x{fde8}
370 \x{fde9}
371 0: \x{fde9}
372 \x{fdea}
373 0: \x{fdea}
374 \x{fdeb}
375 0: \x{fdeb}
376 \x{fdec}
377 0: \x{fdec}
378 \x{fded}
379 0: \x{fded}
380 \x{fdee}
381 0: \x{fdee}
382 \x{fdef}
383 0: \x{fdef}
384
385 /\x{100}/8DZ
386 ------------------------------------------------------------------
387 Bra
388 \x{100}
389 Ket
390 End
391 ------------------------------------------------------------------
392 Capturing subpattern count = 0
393 Options: utf
394 First char = \x{c4}
395 Need char = \x{80}
396
397 /\x{1000}/8DZ
398 ------------------------------------------------------------------
399 Bra
400 \x{1000}
401 Ket
402 End
403 ------------------------------------------------------------------
404 Capturing subpattern count = 0
405 Options: utf
406 First char = \x{e1}
407 Need char = \x{80}
408
409 /\x{10000}/8DZ
410 ------------------------------------------------------------------
411 Bra
412 \x{10000}
413 Ket
414 End
415 ------------------------------------------------------------------
416 Capturing subpattern count = 0
417 Options: utf
418 First char = \x{f0}
419 Need char = \x{80}
420
421 /\x{100000}/8DZ
422 ------------------------------------------------------------------
423 Bra
424 \x{100000}
425 Ket
426 End
427 ------------------------------------------------------------------
428 Capturing subpattern count = 0
429 Options: utf
430 First char = \x{f4}
431 Need char = \x{80}
432
433 /\x{10ffff}/8DZ
434 ------------------------------------------------------------------
435 Bra
436 \x{10ffff}
437 Ket
438 End
439 ------------------------------------------------------------------
440 Capturing subpattern count = 0
441 Options: utf
442 First char = \x{f4}
443 Need char = \x{bf}
444
445 /[\x{ff}]/8DZ
446 ------------------------------------------------------------------
447 Bra
448 \x{ff}
449 Ket
450 End
451 ------------------------------------------------------------------
452 Capturing subpattern count = 0
453 Options: utf
454 First char = \x{c3}
455 Need char = \x{bf}
456
457 /[\x{100}]/8DZ
458 ------------------------------------------------------------------
459 Bra
460 \x{100}
461 Ket
462 End
463 ------------------------------------------------------------------
464 Capturing subpattern count = 0
465 Options: utf
466 First char = \x{c4}
467 Need char = \x{80}
468
469 /\x80/8DZ
470 ------------------------------------------------------------------
471 Bra
472 \x{80}
473 Ket
474 End
475 ------------------------------------------------------------------
476 Capturing subpattern count = 0
477 Options: utf
478 First char = \x{c2}
479 Need char = \x{80}
480
481 /\xff/8DZ
482 ------------------------------------------------------------------
483 Bra
484 \x{ff}
485 Ket
486 End
487 ------------------------------------------------------------------
488 Capturing subpattern count = 0
489 Options: utf
490 First char = \x{c3}
491 Need char = \x{bf}
492
493 /\x{D55c}\x{ad6d}\x{C5B4}/DZ8
494 ------------------------------------------------------------------
495 Bra
496 \x{d55c}\x{ad6d}\x{c5b4}
497 Ket
498 End
499 ------------------------------------------------------------------
500 Capturing subpattern count = 0
501 Options: utf
502 First char = \x{ed}
503 Need char = \x{b4}
504 \x{D55c}\x{ad6d}\x{C5B4}
505 0: \x{d55c}\x{ad6d}\x{c5b4}
506
507 /\x{65e5}\x{672c}\x{8a9e}/DZ8
508 ------------------------------------------------------------------
509 Bra
510 \x{65e5}\x{672c}\x{8a9e}
511 Ket
512 End
513 ------------------------------------------------------------------
514 Capturing subpattern count = 0
515 Options: utf
516 First char = \x{e6}
517 Need char = \x{9e}
518 \x{65e5}\x{672c}\x{8a9e}
519 0: \x{65e5}\x{672c}\x{8a9e}
520
521 /\x{80}/DZ8
522 ------------------------------------------------------------------
523 Bra
524 \x{80}
525 Ket
526 End
527 ------------------------------------------------------------------
528 Capturing subpattern count = 0
529 Options: utf
530 First char = \x{c2}
531 Need char = \x{80}
532
533 /\x{084}/DZ8
534 ------------------------------------------------------------------
535 Bra
536 \x{84}
537 Ket
538 End
539 ------------------------------------------------------------------
540 Capturing subpattern count = 0
541 Options: utf
542 First char = \x{c2}
543 Need char = \x{84}
544
545 /\x{104}/DZ8
546 ------------------------------------------------------------------
547 Bra
548 \x{104}
549 Ket
550 End
551 ------------------------------------------------------------------
552 Capturing subpattern count = 0
553 Options: utf
554 First char = \x{c4}
555 Need char = \x{84}
556
557 /\x{861}/DZ8
558 ------------------------------------------------------------------
559 Bra
560 \x{861}
561 Ket
562 End
563 ------------------------------------------------------------------
564 Capturing subpattern count = 0
565 Options: utf
566 First char = \x{e0}
567 Need char = \x{a1}
568
569 /\x{212ab}/DZ8
570 ------------------------------------------------------------------
571 Bra
572 \x{212ab}
573 Ket
574 End
575 ------------------------------------------------------------------
576 Capturing subpattern count = 0
577 Options: utf
578 First char = \x{f0}
579 Need char = \x{ab}
580
581 /-- This one is here not because it's different to Perl, but because the way
582 the captured single-byte is displayed. (In Perl it becomes a character, and you
583 can't tell the difference.) --/
584
585 /X(\C)(.*)/8
586 X\x{1234}
587 0: X\x{1234}
588 1: \x{e1}
589 2: \x{88}\x{b4}
590 X\nabc
591 0: X\x{0a}abc
592 1: \x{0a}
593 2: abc
594
595 /-- This one is here because Perl gives out a grumbly error message (quite
596 correctly, but that messes up comparisons). --/
597
598 /a\Cb/8
599 *** Failers
600 No match
601 a\x{100}b
602 No match
603
604 /[^ab\xC0-\xF0]/8SDZ
605 ------------------------------------------------------------------
606 Bra
607 [\x00-`c-\xbf\xf1-\xff] (neg)
608 Ket
609 End
610 ------------------------------------------------------------------
611 Capturing subpattern count = 0
612 Options: utf
613 No first char
614 No need char
615 Subject length lower bound = 1
616 Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
617 \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
618 \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4
619 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y
620 Z [ \ ] ^ _ ` c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f
621 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0
622 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf
623 \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee
624 \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd
625 \xfe \xff
626 \x{f1}
627 0: \x{f1}
628 \x{bf}
629 0: \x{bf}
630 \x{100}
631 0: \x{100}
632 \x{1000}
633 0: \x{1000}
634 *** Failers
635 0: *
636 \x{c0}
637 No match
638 \x{f0}
639 No match
640
641 /Ā{3,4}/8SDZ
642 ------------------------------------------------------------------
643 Bra
644 \x{100}{3}
645 \x{100}?
646 Ket
647 End
648 ------------------------------------------------------------------
649 Capturing subpattern count = 0
650 Options: utf
651 First char = \x{c4}
652 Need char = \x{80}
653 Subject length lower bound = 3
654 No set of starting bytes
655 \x{100}\x{100}\x{100}\x{100\x{100}
656 0: \x{100}\x{100}\x{100}
657
658 /(\x{100}+|x)/8SDZ
659 ------------------------------------------------------------------
660 Bra
661 CBra 1
662 \x{100}+
663 Alt
664 x
665 Ket
666 Ket
667 End
668 ------------------------------------------------------------------
669 Capturing subpattern count = 1
670 Options: utf
671 No first char
672 No need char
673 Subject length lower bound = 1
674 Starting byte set: x \xc4
675
676 /(\x{100}*a|x)/8SDZ
677 ------------------------------------------------------------------
678 Bra
679 CBra 1
680 \x{100}*+
681 a
682 Alt
683 x
684 Ket
685 Ket
686 End
687 ------------------------------------------------------------------
688 Capturing subpattern count = 1
689 Options: utf
690 No first char
691 No need char
692 Subject length lower bound = 1
693 Starting byte set: a x \xc4
694
695 /(\x{100}{0,2}a|x)/8SDZ
696 ------------------------------------------------------------------
697 Bra
698 CBra 1
699 \x{100}{0,2}
700 a
701 Alt
702 x
703 Ket
704 Ket
705 End
706 ------------------------------------------------------------------
707 Capturing subpattern count = 1
708 Options: utf
709 No first char
710 No need char
711 Subject length lower bound = 1
712 Starting byte set: a x \xc4
713
714 /(\x{100}{1,2}a|x)/8SDZ
715 ------------------------------------------------------------------
716 Bra
717 CBra 1
718 \x{100}
719 \x{100}{0,1}
720 a
721 Alt
722 x
723 Ket
724 Ket
725 End
726 ------------------------------------------------------------------
727 Capturing subpattern count = 1
728 Options: utf
729 No first char
730 No need char
731 Subject length lower bound = 1
732 Starting byte set: x \xc4
733
734 /\x{100}/8DZ
735 ------------------------------------------------------------------
736 Bra
737 \x{100}
738 Ket
739 End
740 ------------------------------------------------------------------
741 Capturing subpattern count = 0
742 Options: utf
743 First char = \x{c4}
744 Need char = \x{80}
745
746 /a\x{100}\x{101}*/8DZ
747 ------------------------------------------------------------------
748 Bra
749 a\x{100}
750 \x{101}*
751 Ket
752 End
753 ------------------------------------------------------------------
754 Capturing subpattern count = 0
755 Options: utf
756 First char = 'a'
757 Need char = \x{80}
758
759 /a\x{100}\x{101}+/8DZ
760 ------------------------------------------------------------------
761 Bra
762 a\x{100}
763 \x{101}+
764 Ket
765 End
766 ------------------------------------------------------------------
767 Capturing subpattern count = 0
768 Options: utf
769 First char = 'a'
770 Need char = \x{81}
771
772 /[^\x{c4}]/DZ
773 ------------------------------------------------------------------
774 Bra
775 [^\x{c4}]
776 Ket
777 End
778 ------------------------------------------------------------------
779 Capturing subpattern count = 0
780 No options
781 No first char
782 No need char
783
784 /[\x{100}]/8DZ
785 ------------------------------------------------------------------
786 Bra
787 \x{100}
788 Ket
789 End
790 ------------------------------------------------------------------
791 Capturing subpattern count = 0
792 Options: utf
793 First char = \x{c4}
794 Need char = \x{80}
795 \x{100}
796 0: \x{100}
797 Z\x{100}
798 0: \x{100}
799 \x{100}Z
800 0: \x{100}
801 *** Failers
802 No match
803
804 /[\xff]/DZ8
805 ------------------------------------------------------------------
806 Bra
807 \x{ff}
808 Ket
809 End
810 ------------------------------------------------------------------
811 Capturing subpattern count = 0
812 Options: utf
813 First char = \x{c3}
814 Need char = \x{bf}
815 >\x{ff}<
816 0: \x{ff}
817
818 /[^\xff]/8DZ
819 ------------------------------------------------------------------
820 Bra
821 [^\x{ff}]
822 Ket
823 End
824 ------------------------------------------------------------------
825 Capturing subpattern count = 0
826 Options: utf
827 No first char
828 No need char
829
830 /\x{100}abc(xyz(?1))/8DZ
831 ------------------------------------------------------------------
832 Bra
833 \x{100}abc
834 CBra 1
835 xyz
836 Recurse
837 Ket
838 Ket
839 End
840 ------------------------------------------------------------------
841 Capturing subpattern count = 1
842 Options: utf
843 First char = \x{c4}
844 Need char = 'z'
845
846 /a\x{1234}b/P8
847 a\x{1234}b
848 0: a\x{1234}b
849
850 /\777/8I
851 Capturing subpattern count = 0
852 Options: utf
853 First char = \x{c7}
854 Need char = \x{bf}
855 \x{1ff}
856 0: \x{1ff}
857 \777
858 0: \x{1ff}
859
860 /\x{100}+\x{200}/8DZ
861 ------------------------------------------------------------------
862 Bra
863 \x{100}++
864 \x{200}
865 Ket
866 End
867 ------------------------------------------------------------------
868 Capturing subpattern count = 0
869 Options: utf
870 First char = \x{c4}
871 Need char = \x{80}
872
873 /\x{100}+X/8DZ
874 ------------------------------------------------------------------
875 Bra
876 \x{100}++
877 X
878 Ket
879 End
880 ------------------------------------------------------------------
881 Capturing subpattern count = 0
882 Options: utf
883 First char = \x{c4}
884 Need char = 'X'
885
886 /^[\QĀ\E-\QŐ\E/BZ8
887 Failed: missing terminating ] for character class at offset 15
888
889 /-- This tests the stricter UTF-8 check according to RFC 3629. --/
890
891 /X/8
892 \x{d800}
893 Error -10 (bad UTF-8 string) offset=0 reason=14
894 \x{d800}\?
895 No match
896 \x{da00}
897 Error -10 (bad UTF-8 string) offset=0 reason=14
898 \x{da00}\?
899 No match
900 \x{dfff}
901 Error -10 (bad UTF-8 string) offset=0 reason=14
902 \x{dfff}\?
903 No match
904 \x{110000}
905 Error -10 (bad UTF-8 string) offset=0 reason=13
906 \x{110000}\?
907 No match
908 \x{2000000}
909 Error -10 (bad UTF-8 string) offset=0 reason=11
910 \x{2000000}\?
911 No match
912 \x{7fffffff}
913 Error -10 (bad UTF-8 string) offset=0 reason=12
914 \x{7fffffff}\?
915 No match
916
917 /(*UTF8)\x{1234}/
918 abcd\x{1234}pqr
919 0: \x{1234}
920
921 /(*CRLF)(*UTF)(*BSR_UNICODE)a\Rb/I
922 Capturing subpattern count = 0
923 Options: bsr_unicode utf
924 Forced newline sequence: CRLF
925 First char = 'a'
926 Need char = 'b'
927
928 /\h/SI8
929 Capturing subpattern count = 0
930 Options: utf
931 No first char
932 No need char
933 Subject length lower bound = 1
934 Starting byte set: \x09 \x20 \xc2 \xe1 \xe2 \xe3
935 ABC\x{09}
936 0: \x{09}
937 ABC\x{20}
938 0:
939 ABC\x{a0}
940 0: \x{a0}
941 ABC\x{1680}
942 0: \x{1680}
943 ABC\x{180e}
944 0: \x{180e}
945 ABC\x{2000}
946 0: \x{2000}
947 ABC\x{202f}
948 0: \x{202f}
949 ABC\x{205f}
950 0: \x{205f}
951 ABC\x{3000}
952 0: \x{3000}
953
954 /\v/SI8
955 Capturing subpattern count = 0
956 Options: utf
957 No first char
958 No need char
959 Subject length lower bound = 1
960 Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2
961 ABC\x{0a}
962 0: \x{0a}
963 ABC\x{0b}
964 0: \x{0b}
965 ABC\x{0c}
966 0: \x{0c}
967 ABC\x{0d}
968 0: \x{0d}
969 ABC\x{85}
970 0: \x{85}
971 ABC\x{2028}
972 0: \x{2028}
973
974 /\h*A/SI8
975 Capturing subpattern count = 0
976 Options: utf
977 No first char
978 Need char = 'A'
979 Subject length lower bound = 1
980 Starting byte set: \x09 \x20 A \xc2 \xe1 \xe2 \xe3
981 CDBABC
982 0: A
983
984 /\v+A/SI8
985 Capturing subpattern count = 0
986 Options: utf
987 No first char
988 Need char = 'A'
989 Subject length lower bound = 2
990 Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2
991
992 /\s?xxx\s/8SI
993 Capturing subpattern count = 0
994 Options: utf
995 No first char
996 Need char = 'x'
997 Subject length lower bound = 4
998 Starting byte set: \x09 \x0a \x0c \x0d \x20 x
999
1000 /\sxxx\s/I8ST1
1001 Capturing subpattern count = 0
1002 Options: utf
1003 No first char
1004 Need char = 'x'
1005 Subject length lower bound = 5
1006 Starting byte set: \x09 \x0a \x0c \x0d \x20 \xc2
1007 AB\x{85}xxx\x{a0}XYZ
1008 0: \x{85}xxx\x{a0}
1009 AB\x{a0}xxx\x{85}XYZ
1010 0: \x{a0}xxx\x{85}
1011
1012 /\S \S/I8ST1
1013 Capturing subpattern count = 0
1014 Options: utf
1015 No first char
1016 Need char = ' '
1017 Subject length lower bound = 3
1018 Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e
1019 \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d
1020 \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @
1021 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e
1022 f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xc0 \xc1 \xc2 \xc3
1023 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2
1024 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1
1025 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0
1026 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff
1027 \x{a2} \x{84}
1028 0: \x{a2} \x{84}
1029 A Z
1030 0: A Z
1031
1032 /a+/8
1033 a\x{123}aa\>1
1034 0: aa
1035 a\x{123}aa\>2
1036 Error -11 (bad UTF-8 offset)
1037 a\x{123}aa\>3
1038 0: aa
1039 a\x{123}aa\>4
1040 0: a
1041 a\x{123}aa\>5
1042 No match
1043 a\x{123}aa\>6
1044 Error -24 (bad offset value)
1045
1046 /\x{1234}+/iS8I
1047 Capturing subpattern count = 0
1048 Options: caseless utf
1049 No first char
1050 No need char
1051 Subject length lower bound = 1
1052 Starting byte set: \xe1
1053
1054 /\x{1234}+?/iS8I
1055 Capturing subpattern count = 0
1056 Options: caseless utf
1057 No first char
1058 No need char
1059 Subject length lower bound = 1
1060 Starting byte set: \xe1
1061
1062 /\x{1234}++/iS8I
1063 Capturing subpattern count = 0
1064 Options: caseless utf
1065 No first char
1066 No need char
1067 Subject length lower bound = 1
1068 Starting byte set: \xe1
1069
1070 /\x{1234}{2}/iS8I
1071 Capturing subpattern count = 0
1072 Options: caseless utf
1073 No first char
1074 No need char
1075 Subject length lower bound = 2
1076 Starting byte set: \xe1
1077
1078 /[^\x{c4}]/8DZ
1079 ------------------------------------------------------------------
1080 Bra
1081 [^\x{c4}]
1082 Ket
1083 End
1084 ------------------------------------------------------------------
1085 Capturing subpattern count = 0
1086 Options: utf
1087 No first char
1088 No need char
1089
1090 /X+\x{200}/8DZ
1091 ------------------------------------------------------------------
1092 Bra
1093 X++
1094 \x{200}
1095 Ket
1096 End
1097 ------------------------------------------------------------------
1098 Capturing subpattern count = 0
1099 Options: utf
1100 First char = 'X'
1101 Need char = \x{80}
1102
1103 /\R/SI8
1104 Capturing subpattern count = 0
1105 Options: utf
1106 No first char
1107 No need char
1108 Subject length lower bound = 1
1109 Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2
1110
1111 /\777/8DZ
1112 ------------------------------------------------------------------
1113 Bra
1114 \x{1ff}
1115 Ket
1116 End
1117 ------------------------------------------------------------------
1118 Capturing subpattern count = 0
1119 Options: utf
1120 First char = \x{c7}
1121 Need char = \x{bf}
1122
1123 /\w+\x{C4}/8BZ
1124 ------------------------------------------------------------------
1125 Bra
1126 \w++
1127 \x{c4}
1128 Ket
1129 End
1130 ------------------------------------------------------------------
1131 a\x{C4}\x{C4}
1132 0: a\x{c4}
1133
1134 /\w+\x{C4}/8BZT1
1135 ------------------------------------------------------------------
1136 Bra
1137 \w+
1138 \x{c4}
1139 Ket
1140 End
1141 ------------------------------------------------------------------
1142 a\x{C4}\x{C4}
1143 0: a\x{c4}\x{c4}
1144
1145 /\W+\x{C4}/8BZ
1146 ------------------------------------------------------------------
1147 Bra
1148 \W+
1149 \x{c4}
1150 Ket
1151 End
1152 ------------------------------------------------------------------
1153 !\x{C4}
1154 0: !\x{c4}
1155
1156 /\W+\x{C4}/8BZT1
1157 ------------------------------------------------------------------
1158 Bra
1159 \W++
1160 \x{c4}
1161 Ket
1162 End
1163 ------------------------------------------------------------------
1164 !\x{C4}
1165 0: !\x{c4}
1166
1167 /\W+\x{A1}/8BZ
1168 ------------------------------------------------------------------
1169 Bra
1170 \W+
1171 \x{a1}
1172 Ket
1173 End
1174 ------------------------------------------------------------------
1175 !\x{A1}
1176 0: !\x{a1}
1177
1178 /\W+\x{A1}/8BZT1
1179 ------------------------------------------------------------------
1180 Bra
1181 \W+
1182 \x{a1}
1183 Ket
1184 End
1185 ------------------------------------------------------------------
1186 !\x{A1}
1187 0: !\x{a1}
1188
1189 /X\s+\x{A0}/8BZ
1190 ------------------------------------------------------------------
1191 Bra
1192 X
1193 \s++
1194 \x{a0}
1195 Ket
1196 End
1197 ------------------------------------------------------------------
1198 X\x20\x{A0}\x{A0}
1199 0: X \x{a0}
1200
1201 /X\s+\x{A0}/8BZT1
1202 ------------------------------------------------------------------
1203 Bra
1204 X
1205 \s+
1206 \x{a0}
1207 Ket
1208 End
1209 ------------------------------------------------------------------
1210 X\x20\x{A0}\x{A0}
1211 0: X \x{a0}\x{a0}
1212
1213 /\S+\x{A0}/8BZ
1214 ------------------------------------------------------------------
1215 Bra
1216 \S+
1217 \x{a0}
1218 Ket
1219 End
1220 ------------------------------------------------------------------
1221 X\x{A0}\x{A0}
1222 0: X\x{a0}\x{a0}
1223
1224 /\S+\x{A0}/8BZT1
1225 ------------------------------------------------------------------
1226 Bra
1227 \S++
1228 \x{a0}
1229 Ket
1230 End
1231 ------------------------------------------------------------------
1232 X\x{A0}\x{A0}
1233 0: X\x{a0}
1234
1235 /\x{a0}+\s!/8BZ
1236 ------------------------------------------------------------------
1237 Bra
1238 \x{a0}++
1239 \s
1240 !
1241 Ket
1242 End
1243 ------------------------------------------------------------------
1244 \x{a0}\x20!
1245 0: \x{a0} !
1246
1247 /\x{a0}+\s!/8BZT1
1248 ------------------------------------------------------------------
1249 Bra
1250 \x{a0}+
1251 \s
1252 !
1253 Ket
1254 End
1255 ------------------------------------------------------------------
1256 \x{a0}\x20!
1257 0: \x{a0} !
1258
1259 /A/8
1260 \x{ff000041}
1261 ** Character \x{ff000041} is greater than 0x7fffffff and so cannot be converted to UTF-8
1262 \x{7f000041}
1263 Error -10 (bad UTF-8 string) offset=0 reason=12
1264
1265 /-- End of testinput15 --/
