117 |
Character classes |
Character classes |
118 |
----------------- |
----------------- |
119 |
|
|
120 |
OP_CLASS is used for a character class. It is followed by a 32-byte bit map |
OP_CLASS is used for a character class, and OP_NEGCLASS for a negated character |
121 |
containing a 1 bit for every character that is acceptable. The bits are counted |
class, provided there are at least two characters in the class. If there is |
122 |
from the least significant end of each byte. |
only one character, OP_CHARS is used for a positive class, and OP_NOT for a |
123 |
|
negative one. A set of repeating opcodes (OP_NOTSTAR etc.) is used for a |
124 |
|
repeated, negated, single-character class. |
125 |
|
|
126 |
|
Both OP_CLASS and OP_NEGCLASS are followed by a 32-byte bit map containing a 1 |
127 |
|
bit for every character that is acceptable. The bits are counted from the least |
128 |
|
significant end of each byte. The reason for having two opcodes is to cope with |
129 |
|
negated character classes when caseless matching is specified at run time but |
130 |
|
not at compile time. If it is specified at compile time, the bit map is built |
131 |
|
appropriately. This is the only time that a distinction is made between |
132 |
|
OP_CLASS and OP_NEGCLASS, when the bit map was built in a caseful manner but |
133 |
|
matching must be caseless. For OP_CLASS, a character matches if either of its |
134 |
|
cases is in the bit map, but for OP_NEGCLASS, both of them must be present. |
135 |
|
|
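The caseless rule for the two class opcodes can be sketched in C as follows.
This is a minimal illustration, not the library's actual code: the opcode
values and the helper names (in_map, set_bit, clear_bit, other_case,
class_matches) are assumptions made for this example, and case flipping relies
on the C locale via <ctype.h>. It covers only the situation described above,
where the bit map was built in a caseful manner.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical opcode values; the real numbering is not given in the text. */
enum { OP_CLASS = 1, OP_NEGCLASS = 2 };

/* Test the bit for character c in a 32-byte map, counting bits from the
   least significant end of each byte. */
static int in_map(const unsigned char *map, unsigned int c)
{
    assert(c < 256u);
    return (map[c / 8] & (1u << (c % 8))) != 0;
}

/* Mark character c as acceptable. */
static void set_bit(unsigned char *map, unsigned int c)
{
    assert(c < 256u);
    map[c / 8] |= (unsigned char)(1u << (c % 8));
}

/* Mark character c as not acceptable. */
static void clear_bit(unsigned char *map, unsigned int c)
{
    assert(c < 256u);
    map[c / 8] &= (unsigned char)~(1u << (c % 8));
}

/* Return the other case of a letter, or c itself for a non-letter. */
static unsigned int other_case(unsigned int c)
{
    if (islower((int)c)) return (unsigned int)toupper((int)c);
    if (isupper((int)c)) return (unsigned int)tolower((int)c);
    return c;
}

/* Decide whether c matches a class whose map was built casefully.
   Caseful matching is a plain bit test. For caseless matching, OP_CLASS
   accepts if either case is in the map; OP_NEGCLASS needs both. */
static int class_matches(int op, const unsigned char *map,
                         unsigned int c, int caseless)
{
    int here = in_map(map, c);
    int there;

    if (!caseless) return here;
    there = in_map(map, other_case(c));
    return (op == OP_NEGCLASS) ? (here && there) : (here || there);
}
```

For a map built casefully from [^ab], the bit for 'A' is set but the bit for
'a' is not, so a caseless OP_NEGCLASS test on 'A' fails, which is the wanted
result. With the same rule as OP_CLASS, 'A' would wrongly be accepted.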
136 |
|
|
137 |
Back references |
Back references |
208 |
|
|
209 |
|
|
210 |
Philip Hazel |
Philip Hazel |
211 |
October 1997 |
December 1997 |