Contents of /code/trunk/testdata/testoutput15

Revision 1219
Sun Nov 11 18:04:37 2012 UTC by ph10
File size: 30543 byte(s)
Support (*UTF) in all libraries.
1 /-- This set of tests is for UTF-8 support, and is relevant only to the 8-bit
2 library. --/
3
4 /X(\C{3})/8
5 X\x{1234}
6 0: X\x{1234}
7 1: \x{1234}
8
9 /X(\C{4})/8
10 X\x{1234}YZ
11 0: X\x{1234}Y
12 1: \x{1234}Y
13
14 /X\C*/8
15 XYZabcdce
16 0: XYZabcdce
17
18 /X\C*?/8
19 XYZabcde
20 0: X
21
22 /X\C{3,5}/8
23 Xabcdefg
24 0: Xabcde
25 X\x{1234}
26 0: X\x{1234}
27 X\x{1234}YZ
28 0: X\x{1234}YZ
29 X\x{1234}\x{512}
30 0: X\x{1234}\x{512}
31 X\x{1234}\x{512}YZ
32 0: X\x{1234}\x{512}
33
34 /X\C{3,5}?/8
35 Xabcdefg
36 0: Xabc
37 X\x{1234}
38 0: X\x{1234}
39 X\x{1234}YZ
40 0: X\x{1234}
41 X\x{1234}\x{512}
42 0: X\x{1234}
43
44 /a\Cb/8
45 aXb
46 0: aXb
47 a\nb
48 0: a\x{0a}b
49
50 /a\C\Cb/8
51 a\x{100}b
52 0: a\x{100}b
53
54 /ab\Cde/8
55 abXde
56 0: abXde
57
58 /a\C\Cb/8
59 a\x{100}b
60 0: a\x{100}b
61 ** Failers
62 No match
63 a\x{12257}b
64 No match
65
66 /[]/8
67 Failed: invalid UTF-8 string at offset 1
68
69 //8
70 Failed: invalid UTF-8 string at offset 0
71
72 /xxx/8
73 Failed: invalid UTF-8 string at offset 0
74
75 /xxx/8?DZSS
76 ------------------------------------------------------------------
77 Bra
78 \X{c0}\X{c0}\X{c0}xxx
79 Ket
80 End
81 ------------------------------------------------------------------
82 Capturing subpattern count = 0
83 Options: utf no_utf_check
84 First char = \x{c3}
85 Need char = 'x'
86
87 /badutf/8
88 \xdf
89 Error -10 (bad UTF-8 string) offset=0 reason=1
90 \xef
91 Error -10 (bad UTF-8 string) offset=0 reason=2
92 \xef\x80
93 Error -10 (bad UTF-8 string) offset=0 reason=1
94 \xf7
95 Error -10 (bad UTF-8 string) offset=0 reason=3
96 \xf7\x80
97 Error -10 (bad UTF-8 string) offset=0 reason=2
98 \xf7\x80\x80
99 Error -10 (bad UTF-8 string) offset=0 reason=1
100 \xfb
101 Error -10 (bad UTF-8 string) offset=0 reason=4
102 \xfb\x80
103 Error -10 (bad UTF-8 string) offset=0 reason=3
104 \xfb\x80\x80
105 Error -10 (bad UTF-8 string) offset=0 reason=2
106 \xfb\x80\x80\x80
107 Error -10 (bad UTF-8 string) offset=0 reason=1
108 \xfd
109 Error -10 (bad UTF-8 string) offset=0 reason=5
110 \xfd\x80
111 Error -10 (bad UTF-8 string) offset=0 reason=4
112 \xfd\x80\x80
113 Error -10 (bad UTF-8 string) offset=0 reason=3
114 \xfd\x80\x80\x80
115 Error -10 (bad UTF-8 string) offset=0 reason=2
116 \xfd\x80\x80\x80\x80
117 Error -10 (bad UTF-8 string) offset=0 reason=1
118 \xdf\x7f
119 Error -10 (bad UTF-8 string) offset=0 reason=6
120 \xef\x7f\x80
121 Error -10 (bad UTF-8 string) offset=0 reason=6
122 \xef\x80\x7f
123 Error -10 (bad UTF-8 string) offset=0 reason=7
124 \xf7\x7f\x80\x80
125 Error -10 (bad UTF-8 string) offset=0 reason=6
126 \xf7\x80\x7f\x80
127 Error -10 (bad UTF-8 string) offset=0 reason=7
128 \xf7\x80\x80\x7f
129 Error -10 (bad UTF-8 string) offset=0 reason=8
130 \xfb\x7f\x80\x80\x80
131 Error -10 (bad UTF-8 string) offset=0 reason=6
132 \xfb\x80\x7f\x80\x80
133 Error -10 (bad UTF-8 string) offset=0 reason=7
134 \xfb\x80\x80\x7f\x80
135 Error -10 (bad UTF-8 string) offset=0 reason=8
136 \xfb\x80\x80\x80\x7f
137 Error -10 (bad UTF-8 string) offset=0 reason=9
138 \xfd\x7f\x80\x80\x80\x80
139 Error -10 (bad UTF-8 string) offset=0 reason=6
140 \xfd\x80\x7f\x80\x80\x80
141 Error -10 (bad UTF-8 string) offset=0 reason=7
142 \xfd\x80\x80\x7f\x80\x80
143 Error -10 (bad UTF-8 string) offset=0 reason=8
144 \xfd\x80\x80\x80\x7f\x80
145 Error -10 (bad UTF-8 string) offset=0 reason=9
146 \xfd\x80\x80\x80\x80\x7f
147 Error -10 (bad UTF-8 string) offset=0 reason=10
148 \xed\xa0\x80
149 Error -10 (bad UTF-8 string) offset=0 reason=14
150 \xc0\x8f
151 Error -10 (bad UTF-8 string) offset=0 reason=15
152 \xe0\x80\x8f
153 Error -10 (bad UTF-8 string) offset=0 reason=16
154 \xf0\x80\x80\x8f
155 Error -10 (bad UTF-8 string) offset=0 reason=17
156 \xf8\x80\x80\x80\x8f
157 Error -10 (bad UTF-8 string) offset=0 reason=18
158 \xfc\x80\x80\x80\x80\x8f
159 Error -10 (bad UTF-8 string) offset=0 reason=19
160 \x80
161 Error -10 (bad UTF-8 string) offset=0 reason=20
162 \xfe
163 Error -10 (bad UTF-8 string) offset=0 reason=21
164 \xff
165 Error -10 (bad UTF-8 string) offset=0 reason=21
166 \xef\xb7\x90
167 Error -10 (bad UTF-8 string) offset=0 reason=22
168
169 /badutf/8
170 \xfb\x80\x80\x80\x80
171 Error -10 (bad UTF-8 string) offset=0 reason=11
172 \xfd\x80\x80\x80\x80\x80
173 Error -10 (bad UTF-8 string) offset=0 reason=12
174 \xf7\xbf\xbf\xbf
175 Error -10 (bad UTF-8 string) offset=0 reason=13
176
177 /shortutf/8
178 \P\P\xdf
179 Error -25 (short UTF-8 string) offset=0 reason=1
180 \P\P\xef
181 Error -25 (short UTF-8 string) offset=0 reason=2
182 \P\P\xef\x80
183 Error -25 (short UTF-8 string) offset=0 reason=1
184 \P\P\xf7
185 Error -25 (short UTF-8 string) offset=0 reason=3
186 \P\P\xf7\x80
187 Error -25 (short UTF-8 string) offset=0 reason=2
188 \P\P\xf7\x80\x80
189 Error -25 (short UTF-8 string) offset=0 reason=1
190 \P\P\xfb
191 Error -25 (short UTF-8 string) offset=0 reason=4
192 \P\P\xfb\x80
193 Error -25 (short UTF-8 string) offset=0 reason=3
194 \P\P\xfb\x80\x80
195 Error -25 (short UTF-8 string) offset=0 reason=2
196 \P\P\xfb\x80\x80\x80
197 Error -25 (short UTF-8 string) offset=0 reason=1
198 \P\P\xfd
199 Error -25 (short UTF-8 string) offset=0 reason=5
200 \P\P\xfd\x80
201 Error -25 (short UTF-8 string) offset=0 reason=4
202 \P\P\xfd\x80\x80
203 Error -25 (short UTF-8 string) offset=0 reason=3
204 \P\P\xfd\x80\x80\x80
205 Error -25 (short UTF-8 string) offset=0 reason=2
206 \P\P\xfd\x80\x80\x80\x80
207 Error -25 (short UTF-8 string) offset=0 reason=1
208
209 /anything/8
210 \xc0\x80
211 Error -10 (bad UTF-8 string) offset=0 reason=15
212 \xc1\x8f
213 Error -10 (bad UTF-8 string) offset=0 reason=15
214 \xe0\x9f\x80
215 Error -10 (bad UTF-8 string) offset=0 reason=16
216 \xf0\x8f\x80\x80
217 Error -10 (bad UTF-8 string) offset=0 reason=17
218 \xf8\x87\x80\x80\x80
219 Error -10 (bad UTF-8 string) offset=0 reason=18
220 \xfc\x83\x80\x80\x80\x80
221 Error -10 (bad UTF-8 string) offset=0 reason=19
222 \xfe\x80\x80\x80\x80\x80
223 Error -10 (bad UTF-8 string) offset=0 reason=21
224 \xff\x80\x80\x80\x80\x80
225 Error -10 (bad UTF-8 string) offset=0 reason=21
226 \xc3\x8f
227 No match
228 \xe0\xaf\x80
229 No match
230 \xe1\x80\x80
231 No match
232 \xf0\x9f\x80\x80
233 No match
234 \xf1\x8f\x80\x80
235 No match
236 \xf8\x88\x80\x80\x80
237 Error -10 (bad UTF-8 string) offset=0 reason=11
238 \xf9\x87\x80\x80\x80
239 Error -10 (bad UTF-8 string) offset=0 reason=11
240 \xfc\x84\x80\x80\x80\x80
241 Error -10 (bad UTF-8 string) offset=0 reason=12
242 \xfd\x83\x80\x80\x80\x80
243 Error -10 (bad UTF-8 string) offset=0 reason=12
244 \?\xf8\x88\x80\x80\x80
245 No match
246 \?\xf9\x87\x80\x80\x80
247 No match
248 \?\xfc\x84\x80\x80\x80\x80
249 No match
250 \?\xfd\x83\x80\x80\x80\x80
251 No match
252
253 /noncharacter/8
254 \x{fffe}
255 Error -10 (bad UTF-8 string) offset=0 reason=22
256 \x{ffff}
257 Error -10 (bad UTF-8 string) offset=0 reason=22
258 \x{1fffe}
259 Error -10 (bad UTF-8 string) offset=0 reason=22
260 \x{1ffff}
261 Error -10 (bad UTF-8 string) offset=0 reason=22
262 \x{2fffe}
263 Error -10 (bad UTF-8 string) offset=0 reason=22
264 \x{2ffff}
265 Error -10 (bad UTF-8 string) offset=0 reason=22
266 \x{3fffe}
267 Error -10 (bad UTF-8 string) offset=0 reason=22
268 \x{3ffff}
269 Error -10 (bad UTF-8 string) offset=0 reason=22
270 \x{4fffe}
271 Error -10 (bad UTF-8 string) offset=0 reason=22
272 \x{4ffff}
273 Error -10 (bad UTF-8 string) offset=0 reason=22
274 \x{5fffe}
275 Error -10 (bad UTF-8 string) offset=0 reason=22
276 \x{5ffff}
277 Error -10 (bad UTF-8 string) offset=0 reason=22
278 \x{6fffe}
279 Error -10 (bad UTF-8 string) offset=0 reason=22
280 \x{6ffff}
281 Error -10 (bad UTF-8 string) offset=0 reason=22
282 \x{7fffe}
283 Error -10 (bad UTF-8 string) offset=0 reason=22
284 \x{7ffff}
285 Error -10 (bad UTF-8 string) offset=0 reason=22
286 \x{8fffe}
287 Error -10 (bad UTF-8 string) offset=0 reason=22
288 \x{8ffff}
289 Error -10 (bad UTF-8 string) offset=0 reason=22
290 \x{9fffe}
291 Error -10 (bad UTF-8 string) offset=0 reason=22
292 \x{9ffff}
293 Error -10 (bad UTF-8 string) offset=0 reason=22
294 \x{afffe}
295 Error -10 (bad UTF-8 string) offset=0 reason=22
296 \x{affff}
297 Error -10 (bad UTF-8 string) offset=0 reason=22
298 \x{bfffe}
299 Error -10 (bad UTF-8 string) offset=0 reason=22
300 \x{bffff}
301 Error -10 (bad UTF-8 string) offset=0 reason=22
302 \x{cfffe}
303 Error -10 (bad UTF-8 string) offset=0 reason=22
304 \x{cffff}
305 Error -10 (bad UTF-8 string) offset=0 reason=22
306 \x{dfffe}
307 Error -10 (bad UTF-8 string) offset=0 reason=22
308 \x{dffff}
309 Error -10 (bad UTF-8 string) offset=0 reason=22
310 \x{efffe}
311 Error -10 (bad UTF-8 string) offset=0 reason=22
312 \x{effff}
313 Error -10 (bad UTF-8 string) offset=0 reason=22
314 \x{ffffe}
315 Error -10 (bad UTF-8 string) offset=0 reason=22
316 \x{fffff}
317 Error -10 (bad UTF-8 string) offset=0 reason=22
318 \x{10fffe}
319 Error -10 (bad UTF-8 string) offset=0 reason=22
320 \x{10ffff}
321 Error -10 (bad UTF-8 string) offset=0 reason=22
322 \x{fdd0}
323 Error -10 (bad UTF-8 string) offset=0 reason=22
324 \x{fdd1}
325 Error -10 (bad UTF-8 string) offset=0 reason=22
326 \x{fdd2}
327 Error -10 (bad UTF-8 string) offset=0 reason=22
328 \x{fdd3}
329 Error -10 (bad UTF-8 string) offset=0 reason=22
330 \x{fdd4}
331 Error -10 (bad UTF-8 string) offset=0 reason=22
332 \x{fdd5}
333 Error -10 (bad UTF-8 string) offset=0 reason=22
334 \x{fdd6}
335 Error -10 (bad UTF-8 string) offset=0 reason=22
336 \x{fdd7}
337 Error -10 (bad UTF-8 string) offset=0 reason=22
338 \x{fdd8}
339 Error -10 (bad UTF-8 string) offset=0 reason=22
340 \x{fdd9}
341 Error -10 (bad UTF-8 string) offset=0 reason=22
342 \x{fdda}
343 Error -10 (bad UTF-8 string) offset=0 reason=22
344 \x{fddb}
345 Error -10 (bad UTF-8 string) offset=0 reason=22
346 \x{fddc}
347 Error -10 (bad UTF-8 string) offset=0 reason=22
348 \x{fddd}
349 Error -10 (bad UTF-8 string) offset=0 reason=22
350 \x{fdde}
351 Error -10 (bad UTF-8 string) offset=0 reason=22
352 \x{fddf}
353 Error -10 (bad UTF-8 string) offset=0 reason=22
354 \x{fde0}
355 Error -10 (bad UTF-8 string) offset=0 reason=22
356 \x{fde1}
357 Error -10 (bad UTF-8 string) offset=0 reason=22
358 \x{fde2}
359 Error -10 (bad UTF-8 string) offset=0 reason=22
360 \x{fde3}
361 Error -10 (bad UTF-8 string) offset=0 reason=22
362 \x{fde4}
363 Error -10 (bad UTF-8 string) offset=0 reason=22
364 \x{fde5}
365 Error -10 (bad UTF-8 string) offset=0 reason=22
366 \x{fde6}
367 Error -10 (bad UTF-8 string) offset=0 reason=22
368 \x{fde7}
369 Error -10 (bad UTF-8 string) offset=0 reason=22
370 \x{fde8}
371 Error -10 (bad UTF-8 string) offset=0 reason=22
372 \x{fde9}
373 Error -10 (bad UTF-8 string) offset=0 reason=22
374 \x{fdea}
375 Error -10 (bad UTF-8 string) offset=0 reason=22
376 \x{fdeb}
377 Error -10 (bad UTF-8 string) offset=0 reason=22
378 \x{fdec}
379 Error -10 (bad UTF-8 string) offset=0 reason=22
380 \x{fded}
381 Error -10 (bad UTF-8 string) offset=0 reason=22
382 \x{fdee}
383 Error -10 (bad UTF-8 string) offset=0 reason=22
384 \x{fdef}
385 Error -10 (bad UTF-8 string) offset=0 reason=22
386
387 /\x{100}/8DZ
388 ------------------------------------------------------------------
389 Bra
390 \x{100}
391 Ket
392 End
393 ------------------------------------------------------------------
394 Capturing subpattern count = 0
395 Options: utf
396 First char = \x{c4}
397 Need char = \x{80}
398
399 /\x{1000}/8DZ
400 ------------------------------------------------------------------
401 Bra
402 \x{1000}
403 Ket
404 End
405 ------------------------------------------------------------------
406 Capturing subpattern count = 0
407 Options: utf
408 First char = \x{e1}
409 Need char = \x{80}
410
411 /\x{10000}/8DZ
412 ------------------------------------------------------------------
413 Bra
414 \x{10000}
415 Ket
416 End
417 ------------------------------------------------------------------
418 Capturing subpattern count = 0
419 Options: utf
420 First char = \x{f0}
421 Need char = \x{80}
422
423 /\x{100000}/8DZ
424 ------------------------------------------------------------------
425 Bra
426 \x{100000}
427 Ket
428 End
429 ------------------------------------------------------------------
430 Capturing subpattern count = 0
431 Options: utf
432 First char = \x{f4}
433 Need char = \x{80}
434
435 /\x{10ffff}/8DZ
436 ------------------------------------------------------------------
437 Bra
438 \x{10ffff}
439 Ket
440 End
441 ------------------------------------------------------------------
442 Capturing subpattern count = 0
443 Options: utf
444 First char = \x{f4}
445 Need char = \x{bf}
446
447 /[\x{ff}]/8DZ
448 ------------------------------------------------------------------
449 Bra
450 \x{ff}
451 Ket
452 End
453 ------------------------------------------------------------------
454 Capturing subpattern count = 0
455 Options: utf
456 First char = \x{c3}
457 Need char = \x{bf}
458
459 /[\x{100}]/8DZ
460 ------------------------------------------------------------------
461 Bra
462 \x{100}
463 Ket
464 End
465 ------------------------------------------------------------------
466 Capturing subpattern count = 0
467 Options: utf
468 First char = \x{c4}
469 Need char = \x{80}
470
471 /\x80/8DZ
472 ------------------------------------------------------------------
473 Bra
474 \x{80}
475 Ket
476 End
477 ------------------------------------------------------------------
478 Capturing subpattern count = 0
479 Options: utf
480 First char = \x{c2}
481 Need char = \x{80}
482
483 /\xff/8DZ
484 ------------------------------------------------------------------
485 Bra
486 \x{ff}
487 Ket
488 End
489 ------------------------------------------------------------------
490 Capturing subpattern count = 0
491 Options: utf
492 First char = \x{c3}
493 Need char = \x{bf}
494
495 /\x{D55c}\x{ad6d}\x{C5B4}/DZ8
496 ------------------------------------------------------------------
497 Bra
498 \x{d55c}\x{ad6d}\x{c5b4}
499 Ket
500 End
501 ------------------------------------------------------------------
502 Capturing subpattern count = 0
503 Options: utf
504 First char = \x{ed}
505 Need char = \x{b4}
506 \x{D55c}\x{ad6d}\x{C5B4}
507 0: \x{d55c}\x{ad6d}\x{c5b4}
508
509 /\x{65e5}\x{672c}\x{8a9e}/DZ8
510 ------------------------------------------------------------------
511 Bra
512 \x{65e5}\x{672c}\x{8a9e}
513 Ket
514 End
515 ------------------------------------------------------------------
516 Capturing subpattern count = 0
517 Options: utf
518 First char = \x{e6}
519 Need char = \x{9e}
520 \x{65e5}\x{672c}\x{8a9e}
521 0: \x{65e5}\x{672c}\x{8a9e}
522
523 /\x{80}/DZ8
524 ------------------------------------------------------------------
525 Bra
526 \x{80}
527 Ket
528 End
529 ------------------------------------------------------------------
530 Capturing subpattern count = 0
531 Options: utf
532 First char = \x{c2}
533 Need char = \x{80}
534
535 /\x{084}/DZ8
536 ------------------------------------------------------------------
537 Bra
538 \x{84}
539 Ket
540 End
541 ------------------------------------------------------------------
542 Capturing subpattern count = 0
543 Options: utf
544 First char = \x{c2}
545 Need char = \x{84}
546
547 /\x{104}/DZ8
548 ------------------------------------------------------------------
549 Bra
550 \x{104}
551 Ket
552 End
553 ------------------------------------------------------------------
554 Capturing subpattern count = 0
555 Options: utf
556 First char = \x{c4}
557 Need char = \x{84}
558
559 /\x{861}/DZ8
560 ------------------------------------------------------------------
561 Bra
562 \x{861}
563 Ket
564 End
565 ------------------------------------------------------------------
566 Capturing subpattern count = 0
567 Options: utf
568 First char = \x{e0}
569 Need char = \x{a1}
570
571 /\x{212ab}/DZ8
572 ------------------------------------------------------------------
573 Bra
574 \x{212ab}
575 Ket
576 End
577 ------------------------------------------------------------------
578 Capturing subpattern count = 0
579 Options: utf
580 First char = \x{f0}
581 Need char = \x{ab}
582
583 /-- This one is here not because it's different to Perl, but because of the way
584 the captured single byte is displayed. (In Perl it becomes a character, and you
585 can't tell the difference.) --/
586
587 /X(\C)(.*)/8
588 X\x{1234}
589 0: X\x{1234}
590 1: \x{e1}
591 2: \x{88}\x{b4}
592 X\nabc
593 0: X\x{0a}abc
594 1: \x{0a}
595 2: abc
596
597 /-- This one is here because Perl gives out a grumbly error message (quite
598 correctly, but that messes up comparisons). --/
599
600 /a\Cb/8
601 *** Failers
602 No match
603 a\x{100}b
604 No match
605
606 /[^ab\xC0-\xF0]/8SDZ
607 ------------------------------------------------------------------
608 Bra
609 [\x00-`c-\xbf\xf1-\xff] (neg)
610 Ket
611 End
612 ------------------------------------------------------------------
613 Capturing subpattern count = 0
614 Options: utf
615 No first char
616 No need char
617 Subject length lower bound = 1
618 Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
619 \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
620 \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4
621 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y
622 Z [ \ ] ^ _ ` c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f
623 \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0
624 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf
625 \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee
626 \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd
627 \xfe \xff
628 \x{f1}
629 0: \x{f1}
630 \x{bf}
631 0: \x{bf}
632 \x{100}
633 0: \x{100}
634 \x{1000}
635 0: \x{1000}
636 *** Failers
637 0: *
638 \x{c0}
639 No match
640 \x{f0}
641 No match
642
643 /Ā{3,4}/8SDZ
644 ------------------------------------------------------------------
645 Bra
646 \x{100}{3}
647 \x{100}?
648 Ket
649 End
650 ------------------------------------------------------------------
651 Capturing subpattern count = 0
652 Options: utf
653 First char = \x{c4}
654 Need char = \x{80}
655 Subject length lower bound = 3
656 No set of starting bytes
657 \x{100}\x{100}\x{100}\x{100\x{100}
658 0: \x{100}\x{100}\x{100}
659
660 /(\x{100}+|x)/8SDZ
661 ------------------------------------------------------------------
662 Bra
663 CBra 1
664 \x{100}+
665 Alt
666 x
667 Ket
668 Ket
669 End
670 ------------------------------------------------------------------
671 Capturing subpattern count = 1
672 Options: utf
673 No first char
674 No need char
675 Subject length lower bound = 1
676 Starting byte set: x \xc4
677
678 /(\x{100}*a|x)/8SDZ
679 ------------------------------------------------------------------
680 Bra
681 CBra 1
682 \x{100}*+
683 a
684 Alt
685 x
686 Ket
687 Ket
688 End
689 ------------------------------------------------------------------
690 Capturing subpattern count = 1
691 Options: utf
692 No first char
693 No need char
694 Subject length lower bound = 1
695 Starting byte set: a x \xc4
696
697 /(\x{100}{0,2}a|x)/8SDZ
698 ------------------------------------------------------------------
699 Bra
700 CBra 1
701 \x{100}{0,2}
702 a
703 Alt
704 x
705 Ket
706 Ket
707 End
708 ------------------------------------------------------------------
709 Capturing subpattern count = 1
710 Options: utf
711 No first char
712 No need char
713 Subject length lower bound = 1
714 Starting byte set: a x \xc4
715
716 /(\x{100}{1,2}a|x)/8SDZ
717 ------------------------------------------------------------------
718 Bra
719 CBra 1
720 \x{100}
721 \x{100}{0,1}
722 a
723 Alt
724 x
725 Ket
726 Ket
727 End
728 ------------------------------------------------------------------
729 Capturing subpattern count = 1
730 Options: utf
731 No first char
732 No need char
733 Subject length lower bound = 1
734 Starting byte set: x \xc4
735
736 /\x{100}/8DZ
737 ------------------------------------------------------------------
738 Bra
739 \x{100}
740 Ket
741 End
742 ------------------------------------------------------------------
743 Capturing subpattern count = 0
744 Options: utf
745 First char = \x{c4}
746 Need char = \x{80}
747
748 /a\x{100}\x{101}*/8DZ
749 ------------------------------------------------------------------
750 Bra
751 a\x{100}
752 \x{101}*
753 Ket
754 End
755 ------------------------------------------------------------------
756 Capturing subpattern count = 0
757 Options: utf
758 First char = 'a'
759 Need char = \x{80}
760
761 /a\x{100}\x{101}+/8DZ
762 ------------------------------------------------------------------
763 Bra
764 a\x{100}
765 \x{101}+
766 Ket
767 End
768 ------------------------------------------------------------------
769 Capturing subpattern count = 0
770 Options: utf
771 First char = 'a'
772 Need char = \x{81}
773
774 /[^\x{c4}]/DZ
775 ------------------------------------------------------------------
776 Bra
777 [^\x{c4}]
778 Ket
779 End
780 ------------------------------------------------------------------
781 Capturing subpattern count = 0
782 No options
783 No first char
784 No need char
785
786 /[\x{100}]/8DZ
787 ------------------------------------------------------------------
788 Bra
789 \x{100}
790 Ket
791 End
792 ------------------------------------------------------------------
793 Capturing subpattern count = 0
794 Options: utf
795 First char = \x{c4}
796 Need char = \x{80}
797 \x{100}
798 0: \x{100}
799 Z\x{100}
800 0: \x{100}
801 \x{100}Z
802 0: \x{100}
803 *** Failers
804 No match
805
806 /[\xff]/DZ8
807 ------------------------------------------------------------------
808 Bra
809 \x{ff}
810 Ket
811 End
812 ------------------------------------------------------------------
813 Capturing subpattern count = 0
814 Options: utf
815 First char = \x{c3}
816 Need char = \x{bf}
817 >\x{ff}<
818 0: \x{ff}
819
820 /[^\xff]/8DZ
821 ------------------------------------------------------------------
822 Bra
823 [^\x{ff}]
824 Ket
825 End
826 ------------------------------------------------------------------
827 Capturing subpattern count = 0
828 Options: utf
829 No first char
830 No need char
831
832 /\x{100}abc(xyz(?1))/8DZ
833 ------------------------------------------------------------------
834 Bra
835 \x{100}abc
836 CBra 1
837 xyz
838 Recurse
839 Ket
840 Ket
841 End
842 ------------------------------------------------------------------
843 Capturing subpattern count = 1
844 Options: utf
845 First char = \x{c4}
846 Need char = 'z'
847
848 /a\x{1234}b/P8
849 a\x{1234}b
850 0: a\x{1234}b
851
852 /\777/8I
853 Capturing subpattern count = 0
854 Options: utf
855 First char = \x{c7}
856 Need char = \x{bf}
857 \x{1ff}
858 0: \x{1ff}
859 \777
860 0: \x{1ff}
861
862 /\x{100}+\x{200}/8DZ
863 ------------------------------------------------------------------
864 Bra
865 \x{100}++
866 \x{200}
867 Ket
868 End
869 ------------------------------------------------------------------
870 Capturing subpattern count = 0
871 Options: utf
872 First char = \x{c4}
873 Need char = \x{80}
874
875 /\x{100}+X/8DZ
876 ------------------------------------------------------------------
877 Bra
878 \x{100}++
879 X
880 Ket
881 End
882 ------------------------------------------------------------------
883 Capturing subpattern count = 0
884 Options: utf
885 First char = \x{c4}
886 Need char = 'X'
887
888 /^[\QĀ\E-\QŐ\E/BZ8
889 Failed: missing terminating ] for character class at offset 15
890
891 /-- This tests the stricter UTF-8 check according to RFC 3629. --/
892
893 /X/8
894 \x{0}\x{d7ff}\x{e000}\x{10ffff}
895 Error -10 (bad UTF-8 string) offset=7 reason=22
896 \x{d800}
897 Error -10 (bad UTF-8 string) offset=0 reason=14
898 \x{d800}\?
899 No match
900 \x{da00}
901 Error -10 (bad UTF-8 string) offset=0 reason=14
902 \x{da00}\?
903 No match
904 \x{dfff}
905 Error -10 (bad UTF-8 string) offset=0 reason=14
906 \x{dfff}\?
907 No match
908 \x{110000}
909 Error -10 (bad UTF-8 string) offset=0 reason=13
910 \x{110000}\?
911 No match
912 \x{2000000}
913 Error -10 (bad UTF-8 string) offset=0 reason=11
914 \x{2000000}\?
915 No match
916 \x{7fffffff}
917 Error -10 (bad UTF-8 string) offset=0 reason=12
918 \x{7fffffff}\?
919 No match
920
921 /(*UTF8)\x{1234}/
922 abcd\x{1234}pqr
923 0: \x{1234}
924
925 /(*CRLF)(*UTF)(*BSR_UNICODE)a\Rb/I
926 Capturing subpattern count = 0
927 Options: bsr_unicode utf
928 Forced newline sequence: CRLF
929 First char = 'a'
930 Need char = 'b'
931
932 /\h/SI8
933 Capturing subpattern count = 0
934 Options: utf
935 No first char
936 No need char
937 Subject length lower bound = 1
938 Starting byte set: \x09 \x20 \xc2 \xe1 \xe2 \xe3
939 ABC\x{09}
940 0: \x{09}
941 ABC\x{20}
942 0:
943 ABC\x{a0}
944 0: \x{a0}
945 ABC\x{1680}
946 0: \x{1680}
947 ABC\x{180e}
948 0: \x{180e}
949 ABC\x{2000}
950 0: \x{2000}
951 ABC\x{202f}
952 0: \x{202f}
953 ABC\x{205f}
954 0: \x{205f}
955 ABC\x{3000}
956 0: \x{3000}
957
958 /\v/SI8
959 Capturing subpattern count = 0
960 Options: utf
961 No first char
962 No need char
963 Subject length lower bound = 1
964 Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2
965 ABC\x{0a}
966 0: \x{0a}
967 ABC\x{0b}
968 0: \x{0b}
969 ABC\x{0c}
970 0: \x{0c}
971 ABC\x{0d}
972 0: \x{0d}
973 ABC\x{85}
974 0: \x{85}
975 ABC\x{2028}
976 0: \x{2028}
977
978 /\h*A/SI8
979 Capturing subpattern count = 0
980 Options: utf
981 No first char
982 Need char = 'A'
983 Subject length lower bound = 1
984 Starting byte set: \x09 \x20 A \xc2 \xe1 \xe2 \xe3
985 CDBABC
986 0: A
987
988 /\v+A/SI8
989 Capturing subpattern count = 0
990 Options: utf
991 No first char
992 Need char = 'A'
993 Subject length lower bound = 2
994 Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2
995
996 /\s?xxx\s/8SI
997 Capturing subpattern count = 0
998 Options: utf
999 No first char
1000 Need char = 'x'
1001 Subject length lower bound = 4
1002 Starting byte set: \x09 \x0a \x0c \x0d \x20 x
1003
1004 /\sxxx\s/I8ST1
1005 Capturing subpattern count = 0
1006 Options: utf
1007 No first char
1008 Need char = 'x'
1009 Subject length lower bound = 5
1010 Starting byte set: \x09 \x0a \x0c \x0d \x20 \xc2
1011 AB\x{85}xxx\x{a0}XYZ
1012 0: \x{85}xxx\x{a0}
1013 AB\x{a0}xxx\x{85}XYZ
1014 0: \x{a0}xxx\x{85}
1015
1016 /\S \S/I8ST1
1017 Capturing subpattern count = 0
1018 Options: utf
1019 No first char
1020 Need char = ' '
1021 Subject length lower bound = 3
1022 Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e
1023 \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d
1024 \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @
1025 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e
1026 f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xc0 \xc1 \xc2 \xc3
1027 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2
1028 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1
1029 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0
1030 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff
1031 \x{a2} \x{84}
1032 0: \x{a2} \x{84}
1033 A Z
1034 0: A Z
1035
1036 /a+/8
1037 a\x{123}aa\>1
1038 0: aa
1039 a\x{123}aa\>2
1040 Error -11 (bad UTF-8 offset)
1041 a\x{123}aa\>3
1042 0: aa
1043 a\x{123}aa\>4
1044 0: a
1045 a\x{123}aa\>5
1046 No match
1047 a\x{123}aa\>6
1048 Error -24 (bad offset value)
1049
1050 /\x{1234}+/iS8I
1051 Capturing subpattern count = 0
1052 Options: caseless utf
1053 No first char
1054 No need char
1055 Subject length lower bound = 1
1056 Starting byte set: \xe1
1057
1058 /\x{1234}+?/iS8I
1059 Capturing subpattern count = 0
1060 Options: caseless utf
1061 No first char
1062 No need char
1063 Subject length lower bound = 1
1064 Starting byte set: \xe1
1065
1066 /\x{1234}++/iS8I
1067 Capturing subpattern count = 0
1068 Options: caseless utf
1069 No first char
1070 No need char
1071 Subject length lower bound = 1
1072 Starting byte set: \xe1
1073
1074 /\x{1234}{2}/iS8I
1075 Capturing subpattern count = 0
1076 Options: caseless utf
1077 No first char
1078 No need char
1079 Subject length lower bound = 2
1080 Starting byte set: \xe1
1081
1082 /[^\x{c4}]/8DZ
1083 ------------------------------------------------------------------
1084 Bra
1085 [^\x{c4}]
1086 Ket
1087 End
1088 ------------------------------------------------------------------
1089 Capturing subpattern count = 0
1090 Options: utf
1091 No first char
1092 No need char
1093
1094 /X+\x{200}/8DZ
1095 ------------------------------------------------------------------
1096 Bra
1097 X++
1098 \x{200}
1099 Ket
1100 End
1101 ------------------------------------------------------------------
1102 Capturing subpattern count = 0
1103 Options: utf
1104 First char = 'X'
1105 Need char = \x{80}
1106
1107 /\R/SI8
1108 Capturing subpattern count = 0
1109 Options: utf
1110 No first char
1111 No need char
1112 Subject length lower bound = 1
1113 Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2
1114
1115 /\777/8DZ
1116 ------------------------------------------------------------------
1117 Bra
1118 \x{1ff}
1119 Ket
1120 End
1121 ------------------------------------------------------------------
1122 Capturing subpattern count = 0
1123 Options: utf
1124 First char = \x{c7}
1125 Need char = \x{bf}
1126
1127 /\w+\x{C4}/8BZ
1128 ------------------------------------------------------------------
1129 Bra
1130 \w++
1131 \x{c4}
1132 Ket
1133 End
1134 ------------------------------------------------------------------
1135 a\x{C4}\x{C4}
1136 0: a\x{c4}
1137
1138 /\w+\x{C4}/8BZT1
1139 ------------------------------------------------------------------
1140 Bra
1141 \w+
1142 \x{c4}
1143 Ket
1144 End
1145 ------------------------------------------------------------------
1146 a\x{C4}\x{C4}
1147 0: a\x{c4}\x{c4}
1148
1149 /\W+\x{C4}/8BZ
1150 ------------------------------------------------------------------
1151 Bra
1152 \W+
1153 \x{c4}
1154 Ket
1155 End
1156 ------------------------------------------------------------------
1157 !\x{C4}
1158 0: !\x{c4}
1159
1160 /\W+\x{C4}/8BZT1
1161 ------------------------------------------------------------------
1162 Bra
1163 \W++
1164 \x{c4}
1165 Ket
1166 End
1167 ------------------------------------------------------------------
1168 !\x{C4}
1169 0: !\x{c4}
1170
1171 /\W+\x{A1}/8BZ
1172 ------------------------------------------------------------------
1173 Bra
1174 \W+
1175 \x{a1}
1176 Ket
1177 End
1178 ------------------------------------------------------------------
1179 !\x{A1}
1180 0: !\x{a1}
1181
1182 /\W+\x{A1}/8BZT1
1183 ------------------------------------------------------------------
1184 Bra
1185 \W+
1186 \x{a1}
1187 Ket
1188 End
1189 ------------------------------------------------------------------
1190 !\x{A1}
1191 0: !\x{a1}
1192
1193 /X\s+\x{A0}/8BZ
1194 ------------------------------------------------------------------
1195 Bra
1196 X
1197 \s++
1198 \x{a0}
1199 Ket
1200 End
1201 ------------------------------------------------------------------
1202 X\x20\x{A0}\x{A0}
1203 0: X \x{a0}
1204
1205 /X\s+\x{A0}/8BZT1
1206 ------------------------------------------------------------------
1207 Bra
1208 X
1209 \s+
1210 \x{a0}
1211 Ket
1212 End
1213 ------------------------------------------------------------------
1214 X\x20\x{A0}\x{A0}
1215 0: X \x{a0}\x{a0}
1216
1217 /\S+\x{A0}/8BZ
1218 ------------------------------------------------------------------
1219 Bra
1220 \S+
1221 \x{a0}
1222 Ket
1223 End
1224 ------------------------------------------------------------------
1225 X\x{A0}\x{A0}
1226 0: X\x{a0}\x{a0}
1227
1228 /\S+\x{A0}/8BZT1
1229 ------------------------------------------------------------------
1230 Bra
1231 \S++
1232 \x{a0}
1233 Ket
1234 End
1235 ------------------------------------------------------------------
1236 X\x{A0}\x{A0}
1237 0: X\x{a0}
1238
1239 /\x{a0}+\s!/8BZ
1240 ------------------------------------------------------------------
1241 Bra
1242 \x{a0}++
1243 \s
1244 !
1245 Ket
1246 End
1247 ------------------------------------------------------------------
1248 \x{a0}\x20!
1249 0: \x{a0} !
1250
1251 /\x{a0}+\s!/8BZT1
1252 ------------------------------------------------------------------
1253 Bra
1254 \x{a0}+
1255 \s
1256 !
1257 Ket
1258 End
1259 ------------------------------------------------------------------
1260 \x{a0}\x20!
1261 0: \x{a0} !
1262
1263 /A/8
1264 \x{ff000041}
1265 ** Character \x{ff000041} is greater than 0x7fffffff and so cannot be converted to UTF-8
1266 \x{7f000041}
1267 Error -10 (bad UTF-8 string) offset=0 reason=12
1268
1269 /-- End of testinput15 --/
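
Reader's note (not part of the test file): the `First char` and `Need char` values in the listing, and the single-byte `\C` captures such as `\x{e1}` followed by `\x{88}\x{b4}` for `\x{1234}`, appear to be individual bytes of the UTF-8 encoding of the code points in the pattern. The sketch below is only an illustration using Python's built-in encoder, not PCRE code; the helper names `utf8_bytes` and `first_and_last` are hypothetical and exist only for this example.

```python
def utf8_bytes(code_point):
    """Return the UTF-8 encoding of one code point as a list of byte values."""
    # chr() gives the one-character string; encode() yields its UTF-8 bytes.
    return list(chr(code_point).encode("utf-8"))


def first_and_last(code_point):
    """Return the first and last byte of the UTF-8 encoding of a code point."""
    encoded = utf8_bytes(code_point)
    return encoded[0], encoded[-1]


if __name__ == "__main__":
    # Code points taken from patterns in the listing above.
    for cp in (0x80, 0xFF, 0x100, 0x1000, 0x1234, 0x10000, 0x100000, 0x10FFFF):
        encoded = " ".join(f"{b:02x}" for b in utf8_bytes(cp))
        print(f"U+{cp:04X} -> {encoded}")
```

For a pattern that is a single literal character, the first and last bytes returned by `first_and_last` line up with the `First char` and `Need char` lines reported for that pattern in the listing.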
