Defined Types

Overview

The sole purpose of the typedefs RKCompileErrorCode, RKCompileOption, RKMatchErrorCode, and RKMatchOption is to provide Foundation naming convention equivalents of the corresponding PCRE library values. Neither the typedefs nor the RegexKit methods modify the original PCRE values. In fact, the values as defined in the PCRE library's pcre.h header file can still be used with the equivalent RegexKit methods.

The reasoning behind this is to accommodate later versions of the PCRE library, which may define additional options or error codes. Because RegexKit includes the pcre.h header of the PCRE library it is linked against, the pcre.h values may be used until their equivalents are added to RegexKitTypes.h.

RKBuildConfig

A bitmap of flags representing the configuration options of the PCRE library when it was initially built. Some options represent default values, while others represent features that cannot be altered or added at run time. If a required feature is missing, the underlying PCRE library that RKRegex is linked against will have to be changed. This will most likely require rebuilding the PCRE library and the RegexKit framework from source with the desired configuration options.

typedef enum {
  RKBuildConfigNoOptions,
  RKBuildConfigUTF8 = 1 << 0,
  RKBuildConfigUnicodeProperties = 1 << 1,
  RKBuildConfigNewlineDefault = 0x00000000,
  RKBuildConfigNewlineCR = 0x00100000,
  RKBuildConfigNewlineLF = 0x00200000,
  RKBuildConfigNewlineCRLF = 0x00300000,
  RKBuildConfigNewlineAny = 0x00400000,
  RKBuildConfigNewlineAnyCRLF = 0x00500000,
  RKBuildConfigNewlineMask = 0x00700000,
  RKBuildConfigBackslashRAnyCRLR = 1 << 23,
  RKBuildConfigBackslashRUnicode = 1 << 24
} RKBuildConfig;

RKBuildConfigNoOptions

No build config options specified.

RKBuildConfigUTF8

Set if the PCRE library was compiled with UTF-8 support. This feature is normally enabled for the RegexKit framework by default. See UTF-8 Support and UTF-8 and Unicode Property Support for more information.

RKBuildConfigUnicodeProperties

Set if the PCRE library was compiled with Unicode Properties support, enabling the regular expression pattern escapes \P, \p, and \X. This feature is normally enabled for the RegexKit framework by default. See Unicode Character Property Support and UTF-8 and Unicode Property Support for more information.

RKBuildConfigNewlineDefault

The default newline character sequence. See Code Value of Newline for more information.

RKBuildConfigNewlineCR

The character 13 (carriage return, CR) is the default end of line character. See Code Value of Newline for more information.

RKBuildConfigNewlineLF

The character 10 (linefeed, LF) is the default end of line character. See Code Value of Newline for more information.

RKBuildConfigNewlineCRLF

The character sequence 13 (carriage return, CR), 10 (linefeed, LF) is the default end of line character sequence. See Code Value of Newline for more information.

RKBuildConfigNewlineAny

Any valid Unicode newline sequence is the default end of line. See Code Value of Newline for more information.

RKBuildConfigNewlineAnyCRLF

Any of the character sequences from RKBuildConfigNewlineCR, RKBuildConfigNewlineLF, or RKBuildConfigNewlineCRLF is recognized as the default end of line character sequence. See Code Value of Newline for more information.

RKBuildConfigNewlineMask

A bitmask to extract only the newline setting. See Code Value of Newline and Newlines for more information.

RKBuildConfigBackslashRAnyCRLR

The regular expression escape sequence \R matches only CR, LF, or CRLF.

RKBuildConfigBackslashRUnicode

The regular expression escape sequence \R matches any Unicode line ending sequence.

Declared in RegexKitTypes.h
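As an illustration of how the RKBuildConfig bitmap can be read, the following minimal C sketch masks out the newline setting and tests the two feature flags. The constants are redeclared locally with the values listed above so the sketch compiles without the RegexKit headers. The helper names (newlineSetting, supportsCaselessAbove127, demoBuildConfig) and the sample bitmap are hypothetical and not part of RegexKit.

```c
#include <assert.h>
#include <stdio.h>

/* Subset of RKBuildConfig with the values listed in this section.
   Redeclared here only so the sketch compiles without RegexKit. */
typedef enum {
  RKBuildConfigUTF8              = 1 << 0,
  RKBuildConfigUnicodeProperties = 1 << 1,
  RKBuildConfigNewlineCR         = 0x00100000,
  RKBuildConfigNewlineLF         = 0x00200000,
  RKBuildConfigNewlineCRLF       = 0x00300000,
  RKBuildConfigNewlineAny        = 0x00400000,
  RKBuildConfigNewlineAnyCRLF    = 0x00500000,
  RKBuildConfigNewlineMask       = 0x00700000
} RKBuildConfig;

/* Hypothetical helper: keep only the newline bits of a build config bitmap. */
unsigned int newlineSetting(unsigned int config) {
  return config & RKBuildConfigNewlineMask;
}

/* Hypothetical helper: caseless matching for characters 128 and above needs
   both UTF-8 and Unicode property support (see RKCompileCaseless). */
int supportsCaselessAbove127(unsigned int config) {
  unsigned int needed = RKBuildConfigUTF8 | RKBuildConfigUnicodeProperties;
  return (config & needed) == needed;
}

/* Usage sketch with a made-up bitmap; a real one comes from the framework. */
void demoBuildConfig(void) {
  unsigned int config = RKBuildConfigUTF8 | RKBuildConfigUnicodeProperties |
                        RKBuildConfigNewlineLF;
  assert(newlineSetting(config) == RKBuildConfigNewlineLF);
  assert(supportsCaselessAbove127(config));
  printf("newline bits: 0x%08x\n", newlineSetting(config));
}
```

The newline values are not independent flags (CRLF is CR and LF combined bitwise), so the masked value should be compared for equality rather than tested bit by bit.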

RKCompileErrorCode

The error reported by the PCRE library when attempting to compile a regular expression.

typedef enum {
  RKCompileErrorNoError,
  RKCompileErrorEscapeAtEndOfPattern,
  RKCompileErrorByteEscapeAtEndOfPattern,
  RKCompileErrorUnrecognizedCharacterFollowingEscape,
  RKCompileErrorNumbersOutOfOrder,
  RKCompileErrorNumbersToBig,
  RKCompileErrorMissingTerminatorForCharacterClass,
  RKCompileErrorInvalidEscapeInCharacterClass,
  RKCompileErrorRangeOutOfOrderInCharacterClass,
  RKCompileErrorNothingToRepeat,
  RKCompileErrorInternalErrorUnexpectedRepeat = 11,
  RKCompileErrorUnrecognizedCharacterAfterOption = 12,
  RKCompileErrorPOSIXNamedClassOutsideOfClass = 13,
  RKCompileErrorMissingParentheses = 14,
  RKCompileErrorReferenceToNonExistentSubpattern = 15,
  RKCompileErrorErrorOffsetPassedAsNull = 16,
  RKCompileErrorUnknownOptionBits = 17,
  RKCompileErrorMissingParenthesesAfterComment = 18,
  RKCompileErrorRegexTooLarge = 20,
  RKCompileErrorNoMemory = 21,
  RKCompileErrorUnmatchedParentheses = 22,
  RKCompileErrorInternalCodeOverflow = 23,
  RKCompileErrorUnrecognizedCharacterAfterNamedSubppatern = 24,
  RKCompileErrorLookbehindAssertionNotFixedLength = 25,
  RKCompileErrorMalformedNameOrNumberAfterSubpattern = 26,
  RKCompileErrorConditionalGroupContainsMoreThanTwoBranches = 27,
  RKCompileErrorAssertionExpectedAfterCondition = 28,
  RKCompileErrorMissingEndParentheses = 29,
  RKCompileErrorUnknownPOSIXClassName = 30,
  RKCompileErrorPOSIXCollatingNotSupported = 31,
  RKCompileErrorMissingUTF8Support = 32,
  RKCompileErrorHexCharacterValueTooLarge = 34,
  RKCompileErrorInvalidCondition = 35,
  RKCompileErrorNotAllowedInLookbehindAssertion = 36,
  RKCompileErrorNotSupported = 37,
  RKCompileErrorCalloutExceedsMaximumAllowed = 38,
  RKCompileErrorMissingParenthesesAfterCallout = 39,
  RKCompileErrorRecursiveInfinitLoop = 40,
  RKCompileErrorUnrecognizedCharacterAfterNamedPattern = 41,
  RKCompileErrorSubpatternNameMissingTerminator = 42,
  RKCompileErrorDuplicateSubpatternNames = 43,
  RKCompileErrorInvalidUTF8String = 44,
  RKCompileErrorMissingUnicodeSupport = 45,
  RKCompileErrorMalformedUnicodeProperty = 46,
  RKCompileErrorUnknownPropertyAfterUnicodeCharacter = 47,
  RKCompileErrorSubpatternNameTooLong = 48,
  RKCompileErrorTooManySubpatterns = 49,
  RKCompileErrorRepeatedSubpatternTooLong = 50,
  RKCompileErrorIllegalOctalValueOutsideUTF8 = 51,
  RKCompileErrorInternalOverranCompilingWorkspace = 52,
  RKCompileErrorInternalReferencedSubpatternNotFound = 53,
  RKCompileErrorDEFINEGroupContainsMoreThanOneBranch = 54,
  RKCompileErrorRepeatingDEFINEGroupNotAllowed = 55,
  RKCompileErrorInconsistentNewlineOptions = 56,
  RKCompileErrorReferenceMustBeNonZeroNumberOrBraced = 57,
  RKCompileErrorRelativeSubpatternNumberMustNotBeZero
} RKCompileErrorCode;

RKCompileErrorNoError

No error.

RKCompileErrorEscapeAtEndOfPattern

\ at end of pattern.

RKCompileErrorByteEscapeAtEndOfPattern

\c at end of pattern.

RKCompileErrorUnrecognizedCharacterFollowingEscape

Unrecognized character follows \.

RKCompileErrorNumbersOutOfOrder

Numbers out of order in {} quantifier.

RKCompileErrorNumbersToBig

Number too big in {} quantifier.

RKCompileErrorMissingTerminatorForCharacterClass

Missing terminating ] for character class.

RKCompileErrorInvalidEscapeInCharacterClass

Invalid escape sequence in character class.

RKCompileErrorRangeOutOfOrderInCharacterClass

Range out of order in character class.

RKCompileErrorNothingToRepeat

Nothing to repeat.

RKCompileErrorInternalErrorUnexpectedRepeat

Internal error, unexpected repeat.

RKCompileErrorUnrecognizedCharacterAfterOption

Unrecognized character after (?.

RKCompileErrorPOSIXNamedClassOutsideOfClass

POSIX named classes are supported only within a class.

RKCompileErrorMissingParentheses

Missing ).

RKCompileErrorReferenceToNonExistentSubpattern

Reference to non-existent subpattern.

RKCompileErrorErrorOffsetPassedAsNull

Internal error, erroffset passed as NULL.

RKCompileErrorUnknownOptionBits

Unknown RKCompileOption option bit(s) set.

RKCompileErrorMissingParenthesesAfterComment

Missing ) after comment.

RKCompileErrorRegexTooLarge

Regular expression too large.

RKCompileErrorNoMemory

Memory allocation failure.

RKCompileErrorUnmatchedParentheses

Unmatched parentheses.

RKCompileErrorInternalCodeOverflow

Internal error, code overflow.

RKCompileErrorUnrecognizedCharacterAfterNamedSubppatern

Unrecognized character after (?<.

RKCompileErrorLookbehindAssertionNotFixedLength

Lookbehind assertion is not fixed length.

RKCompileErrorMalformedNameOrNumberAfterSubpattern

Malformed number or name after (?(.

RKCompileErrorConditionalGroupContainsMoreThanTwoBranches

Conditional group contains more than two branches.

RKCompileErrorAssertionExpectedAfterCondition

Assertion expected after (?(.

RKCompileErrorMissingEndParentheses

(?R or (?digits must be followed by ).

RKCompileErrorUnknownPOSIXClassName

Unknown POSIX class name.

RKCompileErrorPOSIXCollatingNotSupported

POSIX collating elements are not supported.

RKCompileErrorMissingUTF8Support

The PCRE library was not built with UTF-8 support. See RKBuildConfigUTF8.

RKCompileErrorHexCharacterValueTooLarge

Character value in \x{...} sequence is too large.

RKCompileErrorInvalidCondition

Invalid condition (?(0).

RKCompileErrorNotAllowedInLookbehindAssertion

\C not allowed in lookbehind assertion.

RKCompileErrorNotSupported

PCRE does not support \L, \l, \N, \U, or \u.

RKCompileErrorCalloutExceedsMaximumAllowed

Number after (?C is > 255.

RKCompileErrorMissingParenthesesAfterCallout

Closing ) for (?C expected.

RKCompileErrorRecursiveInfinitLoop

Recursive call could loop indefinitely.

RKCompileErrorUnrecognizedCharacterAfterNamedPattern

Unrecognized character after (?P.

RKCompileErrorSubpatternNameMissingTerminator

Syntax error in subpattern name (missing terminator).

RKCompileErrorDuplicateSubpatternNames

Two named subpatterns have the same name. See RKCompileDupNames.

RKCompileErrorInvalidUTF8String

Invalid UTF-8 string.

RKCompileErrorMissingUnicodeSupport

The PCRE library was not built with Unicode support. \P, \p, and \X are invalid. See RKBuildConfigUnicodeProperties.

RKCompileErrorMalformedUnicodeProperty

Malformed \P or \p sequence.

RKCompileErrorUnknownPropertyAfterUnicodeCharacter

Unknown property name after \P or \p.

RKCompileErrorSubpatternNameTooLong

Subpattern name is too long (maximum 32 characters).

RKCompileErrorTooManySubpatterns

Too many named subpatterns (maximum 10,000).

RKCompileErrorRepeatedSubpatternTooLong

Repeated subpattern is too long.

RKCompileErrorIllegalOctalValueOutsideUTF8

Octal value is greater than \377 (not in UTF-8 mode).

RKCompileErrorInternalOverranCompilingWorkspace

Internal error, overran compiling workspace.

RKCompileErrorInternalReferencedSubpatternNotFound

Internal error, previously-checked referenced subpattern not found.

RKCompileErrorDEFINEGroupContainsMoreThanOneBranch

DEFINE group contains more than one branch.

RKCompileErrorRepeatingDEFINEGroupNotAllowed

Repeating a DEFINE group is not allowed.

RKCompileErrorInconsistentNewlineOptions

Inconsistent RKCompileNewlineMask options.

RKCompileErrorReferenceMustBeNonZeroNumberOrBraced

\g must be followed by a non-zero number or a braced name or number (i.e., {name} or {0123}).

RKCompileErrorRelativeSubpatternNumberMustNotBeZero

A relative subpattern reference introduced by (?+, (?-, (?(+, or (?(- must be followed by a non-zero number.

Declared in RegexKitTypes.h
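One way to act on a compile error is to group the codes by what the caller can do about them. The C sketch below is only an illustration: the grouping, the CompileErrorKind type, and the function names are hypothetical and not part of RegexKit. The numeric values are the ones listed above, except RKCompileErrorNoError, whose value does not appear in the listing and is assumed to be 0.

```c
#include <assert.h>
#include <stdio.h>

/* Subset of RKCompileErrorCode, redeclared so the sketch compiles alone. */
typedef enum {
  RKCompileErrorNoError                              = 0,  /* assumption: not shown in the listing */
  RKCompileErrorInternalErrorUnexpectedRepeat        = 11,
  RKCompileErrorErrorOffsetPassedAsNull              = 16,
  RKCompileErrorNoMemory                             = 21,
  RKCompileErrorInternalCodeOverflow                 = 23,
  RKCompileErrorLookbehindAssertionNotFixedLength    = 25,
  RKCompileErrorMissingUTF8Support                   = 32,
  RKCompileErrorMissingUnicodeSupport                = 45,
  RKCompileErrorInternalOverranCompilingWorkspace    = 52,
  RKCompileErrorInternalReferencedSubpatternNotFound = 53
} RKCompileErrorCode;

/* Hypothetical grouping of the codes by the kind of remedy. */
typedef enum {
  CompileOK, FixThePattern, RebuildPCRE, OutOfMemory, InternalError
} CompileErrorKind;

CompileErrorKind compileErrorKind(int code) {
  switch (code) {
    case RKCompileErrorNoError:
      return CompileOK;
    case RKCompileErrorMissingUTF8Support:     /* see RKBuildConfigUTF8 */
    case RKCompileErrorMissingUnicodeSupport:  /* see RKBuildConfigUnicodeProperties */
      return RebuildPCRE;
    case RKCompileErrorNoMemory:
      return OutOfMemory;
    case RKCompileErrorInternalErrorUnexpectedRepeat:
    case RKCompileErrorErrorOffsetPassedAsNull:
    case RKCompileErrorInternalCodeOverflow:
    case RKCompileErrorInternalOverranCompilingWorkspace:
    case RKCompileErrorInternalReferencedSubpatternNotFound:
      return InternalError;                    /* described as "Internal error" above */
    default:
      return FixThePattern;                    /* every other code reports a pattern problem */
  }
}

/* Usage sketch. */
void demoCompileErrorKind(void) {
  int code = RKCompileErrorLookbehindAssertionNotFixedLength;
  assert(compileErrorKind(code) == FixThePattern);
  printf("code %d -> kind %d\n", code, (int)compileErrorKind(code));
}
```

Treating unknown codes as pattern problems in the default branch keeps the sketch usable if a later PCRE version defines additional error codes.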

RKCompileOption

A collection of bitmask options that can be combined together and passed via the options argument of regexWithRegexString:options: or initWithRegexString:options:.

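typedef enum {
  RKCompileNoOptions,
  RKCompileCaseless = 1 << 0,
  RKCompileMultiline = 1 << 1,
  RKCompileDotAll = 1 << 2,
  RKCompileExtended = 1 << 3,
  RKCompileAnchored = 1 << 4,
  RKCompileDollarEndOnly = 1 << 5,
  RKCompileExtra = 1 << 6,
  RKCompileUngreedy = 1 << 9,
  RKCompileUTF8 = 1 << 11,
  RKCompileNoAutoCapture = 1 << 12,
  RKCompileNoUTF8Check = 1 << 13,
  RKCompileAutoCallout = 1 << 14,
  RKCompileFirstLine = 1 << 18,
  RKCompileDupNames = 1 << 19,
  RKCompileBackslashRAnyCRLR = 1 << 23,
  RKCompileBackslashRUnicode = 1 << 24,
  RKCompileAllOptions,
  RKCompileUnsupported = (RKCompileAutoCallout),
  RKCompileNewlineDefault = 0x00000000,
  RKCompileNewlineCR = 0x00100000,
  RKCompileNewlineLF = 0x00200000,
  RKCompileNewlineCRLF = 0x00300000,
  RKCompileNewlineAny = 0x00400000,
  RKCompileNewlineAnyCRLF = 0x00500000,
  RKCompileNewlineMask = 0x00700000,
  RKCompileNewlineShift
} RKCompileOption;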

RKCompileNoOptions

No specific options.

RKCompileCaseless

If this bit is set, letters in the pattern match both upper and lower case letters. It is equivalent to Perl's /i option, and it can be changed within a pattern by a ?i option setting. In UTF-8 mode, PCRE always understands the concept of case for characters whose values are less than 128, so caseless matching is always possible. For characters with higher values, the concept of case is supported if the PCRE library is built with Unicode property support, but not otherwise. If you want to use caseless matching for characters 128 and above, you must ensure that the PCRE library is built with Unicode property support as well as with UTF-8 support. See RKBuildConfig.

RKCompileMultiline

By default, PCRE treats the subject string as consisting of a single line of characters (even if it actually contains newlines). The start of line metacharacter ^ matches only at the start of the string, while the end of line metacharacter $ matches only at the end of the string, or before a terminating newline (unless RKCompileDollarEndOnly is set). This is the same as Perl.

When RKCompileMultiline is set, the start of line and end of line constructs match immediately following or immediately before internal newlines in the subject string, respectively, as well as at the very start and end. This is equivalent to Perl's /m option, and it can be changed within a pattern by a ?m option setting. If there are no newlines in a subject string, or no occurrences of ^ or $ in a pattern, setting RKCompileMultiline has no effect.

RKCompileDotAll

If this bit is set, a dot metacharacter in the pattern matches all characters, including those that indicate newline. Without it, a dot does not match when the current position is at a newline. This option is equivalent to Perl's /s option, and it can be changed within a pattern by a ?s option setting. A negative class such as [^a] always matches newline characters, independent of the setting of this option.

RKCompileExtended

If this bit is set, whitespace data characters in the pattern are totally ignored except when escaped or inside a character class. Whitespace does not include the VT character (code 11). In addition, characters between an unescaped # outside a character class and the next newline, inclusive, are also ignored. This is equivalent to Perl's /x option, and it can be changed within a pattern by a ?x option setting.

This option makes it possible to include comments inside complicated patterns. Note, however, that this applies only to data characters. Whitespace characters may never appear within special character sequences in a pattern, for example within the sequence (?( which introduces a conditional subpattern.

RKCompileAnchored

If this bit is set, the pattern is forced to be "anchored", that is, it is constrained to match only at the first matching point in the string that is being searched (the "subject string"). This effect can also be achieved by appropriate constructs in the pattern itself, which is the only way to do it in Perl.

RKCompileDollarEndOnly

If this bit is set, a dollar metacharacter in the pattern matches only at the end of the subject string. Without this option, a dollar also matches immediately before a newline at the end of the string (but not before any other newlines). The RKCompileDollarEndOnly option is ignored if RKCompileMultiline is set. There is no equivalent to this option in Perl, and no way to set it within a pattern.

RKCompileExtra

This option was invented in order to turn on additional functionality of PCRE that is incompatible with Perl, but it is currently of very little use. When set, any backslash in a pattern that is followed by a letter that has no special meaning causes an error, thus reserving these combinations for future expansion. By default, as in Perl, a backslash followed by a letter with no special meaning is treated as a literal. (Perl can, however, be persuaded to give a warning for this.) There are at present no other features controlled by this option. It can also be set by a ?X option setting within a pattern.

RKCompileUngreedy

This option inverts the "greediness" of the quantifiers so that they are not greedy by default, but become greedy if followed by ?. It is not compatible with Perl. It can also be set by a ?U option setting within the pattern.

RKCompileUTF8

This option causes PCRE to regard both the pattern and the subject as strings of UTF-8 characters instead of single-byte character strings. However, it is available only when the PCRE library is built to include UTF-8 support. If not, the use of this option returns an error. See UTF-8 and Unicode Property Support for more information.

RKCompileNoAutoCapture

If this option is set, it disables the use of numbered capturing parentheses in the pattern. Any opening parenthesis that is not followed by ? behaves as if it were followed by ?: but named parentheses can still be used for capturing (and they acquire numbers in the usual way). There is no equivalent of this option in Perl.

RKCompileNoUTF8Check

When RKCompileUTF8 is set, the validity of the pattern as a UTF-8 string is automatically checked. If an invalid UTF-8 sequence of bytes is found, initWithRegexString:options: returns an error. If you already know that your pattern is valid, and you want to skip this check for performance reasons, you can set the RKCompileNoUTF8Check option. When it is set, the effect of passing an invalid UTF-8 string as a pattern is undefined. It may cause your program to crash. Note that RKMatchNoUTF8Check can also be passed to getRanges:withCharacters:length:inRange:options: to suppress the UTF-8 validity checking of subject strings.

RKCompileAutoCallout

If this bit is set, initWithRegexString:options: automatically inserts callout items, all with number 255, before each pattern item. For discussion of the callout facility, see the PCRE Callouts documentation.

Important:

Use of callouts is unsupported and will raise an RKRegexUnsupportedException.

RKCompileFirstLine

If this option is set, an unanchored pattern is required to match before or at the first newline in the subject string, though the matched text may continue over the newline.

RKCompileDupNames

If this bit is set, names used to identify capturing subpatterns need not be unique. This can be helpful for certain types of regular expressions when it is known that only one instance of the named subpattern can ever be matched. See Named Subpatterns for more information. The option may also be set by specifying the (?J) option in the regular expression.

RKCompileBackslashRAnyCRLR

The escape sequence \R for the compiled regular expression will match only CR, LF, or CRLF. This option is mutually exclusive with RKCompileBackslashRUnicode.

RKCompileBackslashRUnicode

The escape sequence \R for the compiled regular expression will match any Unicode line ending sequence. This option is mutually exclusive with RKCompileBackslashRAnyCRLR.

RKCompileAllOptions

Contains a bitmask of all the defined options.

RKCompileUnsupported

Contains a bitmask of invalid options.

RKCompileNewlineDefault

The default newline sequence defined when the PCRE library was built.

RKCompileNewlineCR

The character 13 (carriage return, CR) is the default end of line character.

RKCompileNewlineLF

The character 10 (linefeed, LF) is the default end of line character.

RKCompileNewlineCRLF

The character sequence 13 (carriage return, CR), 10 (linefeed, LF) is the default end of line character sequence.

RKCompileNewlineAny

Any valid Unicode newline sequence is the default end of line.

RKCompileNewlineAnyCRLF

Any of the newline character sequences from RKCompileNewlineCR, RKCompileNewlineLF, or RKCompileNewlineCRLF will be used as a match for the end of line character sequence.

RKCompileNewlineMask

A bitmask to extract only the newline setting.

RKCompileNewlineShift

The number of bits that the newline type is shifted to the left.

Declared in RegexKitTypes.h
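The following C sketch shows how compile options might be combined with bitwise OR, checked for the two conflicts described above, and how the newline type can be read back with the mask and shift. It is a sketch under stated assumptions: the constants are redeclared locally, names are paired with values by the order of the descriptions above, RKCompileNoOptions is assumed to be 0, and RKCompileNewlineShift is assumed to be 20 because the lowest bit of RKCompileNewlineMask (0x00700000) is bit 20. The helper functions are hypothetical and not part of RegexKit.

```c
#include <assert.h>
#include <stdio.h>

/* Subset of RKCompileOption, redeclared so the sketch compiles alone. */
typedef enum {
  RKCompileNoOptions         = 0,          /* assumption: not shown in the listing */
  RKCompileCaseless          = 1 << 0,
  RKCompileMultiline         = 1 << 1,
  RKCompileUTF8              = 1 << 11,
  RKCompileAutoCallout       = 1 << 14,
  RKCompileBackslashRAnyCRLR = 1 << 23,
  RKCompileBackslashRUnicode = 1 << 24,
  RKCompileUnsupported       = (RKCompileAutoCallout),
  RKCompileNewlineLF         = 0x00200000,
  RKCompileNewlineAnyCRLF    = 0x00500000,
  RKCompileNewlineMask       = 0x00700000,
  RKCompileNewlineShift      = 20          /* assumption: derived from the mask */
} RKCompileOption;

/* Hypothetical pre-flight check: rejects unsupported options and the
   mutually exclusive pair of \R settings. Returns 1 if the set looks usable. */
int compileOptionsLookUsable(unsigned int options) {
  if (options & RKCompileUnsupported) return 0;
  if ((options & RKCompileBackslashRAnyCRLR) &&
      (options & RKCompileBackslashRUnicode)) return 0;
  return 1;
}

/* Hypothetical helper: the newline type as a small integer (0 = default). */
unsigned int newlineType(unsigned int options) {
  return (options & RKCompileNewlineMask) >> RKCompileNewlineShift;
}

/* Usage sketch. */
void demoCompileOptions(void) {
  unsigned int options = RKCompileCaseless | RKCompileMultiline |
                         RKCompileUTF8 | RKCompileNewlineLF;
  assert(compileOptionsLookUsable(options));
  assert(newlineType(options) == 2);
  printf("options 0x%08x, newline type %u\n", options, newlineType(options));
}
```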

RKMatchErrorCode

Error codes that are returned by getRanges:withCharacters:length:inRange:options:.

Note:

All RKMatchErrorCode error codes are < 0.

typedef enum {
  RKMatchErrorNoError,
  RKMatchErrorNoMatch = -1,
  RKMatchErrorNull = -2,
  RKMatchErrorBadOption = -3,
  RKMatchErrorBadMagic = -4,
  RKMatchErrorUnknownOpcode = -5,
  RKMatchErrorNoMemory = -6,
  RKMatchErrorNoSubstring = -7,
  RKMatchErrorMatchLimit = -8,
  RKMatchErrorCallout = -9,
  RKMatchErrorBadUTF8 = -10,
  RKMatchErrorBadUTF8Offset = -11,
  RKMatchErrorPartial = -12,
  RKMatchErrorBadPartial = -13,
  RKMatchErrorInternal = -14,
  RKMatchErrorBadCount = -15,
  RKMatchErrorRecursionLimit = -21,
  RKMatchErrorNullWorkSpaceLimit = -22,
  RKMatchErrorBadNewline = -23
} RKMatchErrorCode;

RKMatchErrorNoError

No error.

RKMatchErrorNoMatch

The subject string did not match the regular expression.

RKMatchErrorNull

This error is never returned by getRanges:withCharacters:length:inRange:options:.

RKMatchErrorBadOption

An unrecognized bit was set in the RKMatchOption options argument.

RKMatchErrorBadMagic

PCRE stores a 4-byte "magic number" at the start of the compiled code, to catch the case when it is passed an invalid pointer and to detect when a pattern that was compiled in an environment of one endianness is run in an environment with the other endianness. This is the error that PCRE gives when the magic number is not present.

RKMatchErrorUnknownOpcode

While running the pattern match, an unknown item was encountered in the compiled pattern. This error could be caused by a bug in PCRE or by overwriting of the compiled pattern.

RKMatchErrorNoMemory

If a pattern contains back references and the internal matching buffers used by getRanges:withCharacters:length:inRange:options: are not big enough to hold the referenced substrings, then the PCRE library will allocate a block of memory at the start of matching to use for this purpose. If the PCRE library is unable to allocate the additional memory, this error is returned.

RKMatchErrorNoSubstring

This error is never returned by getRanges:withCharacters:length:inRange:options:.

RKMatchErrorMatchLimit

The internal backtracking limit was reached.

RKMatchErrorCallout

This error is never generated by getRanges:withCharacters:length:inRange:options: itself. It is provided for use by callout functions that want to yield a distinctive error code. See the PCRE Callouts documentation for details.

Important:

Use of callouts is unsupported and will raise an RKRegexUnsupportedException.

RKMatchErrorBadUTF8

A string that contains an invalid UTF-8 byte sequence was passed as a subject.

RKMatchErrorBadUTF8Offset

The UTF-8 byte sequence that was passed as a subject was valid, but the value of searchRange.location did not point to the beginning of a UTF-8 character.

RKMatchErrorPartial

The subject string did not match, but it did match partially. See the Partial Matching in PCRE documentation for details.

RKMatchErrorBadPartial

The RKMatchPartial option was used with a compiled pattern containing items that are not supported for partial matching. See the Partial Matching in PCRE documentation for details.

RKMatchErrorInternal

An unexpected internal error has occurred. This error could be caused by a bug in PCRE or by overwriting of the compiled pattern.

RKMatchErrorBadCount

This error is never returned by getRanges:withCharacters:length:inRange:options:.

RKMatchErrorRecursionLimit

The internal recursion limit was reached.

RKMatchErrorNullWorkSpaceLimit

When a group that can match an empty substring is repeated with an unbounded upper limit, the subject position at the start of the group must be remembered, so that a test for an empty string can be made when the end of the group is reached. Some workspace is required for this; if it runs out, this error is given.

RKMatchErrorBadNewline

An invalid combination of RKMatchNewlineMask options was given.

Declared in RegexKitTypes.h
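Because every RKMatchErrorCode error is negative, a caller can separate "no match" and "partial match" from genuine failures with a simple switch. The C sketch below illustrates this. The MatchOutcome type and classifyMatchResult function are hypothetical and not part of RegexKit, and treating any non-negative value as a successful match is an assumption based on the note above.

```c
#include <assert.h>
#include <stdio.h>

/* Subset of RKMatchErrorCode with the values listed in this section. */
typedef enum {
  RKMatchErrorNoMatch        = -1,
  RKMatchErrorMatchLimit     = -8,
  RKMatchErrorBadUTF8        = -10,
  RKMatchErrorPartial        = -12,
  RKMatchErrorRecursionLimit = -21
} RKMatchErrorCode;

/* Hypothetical summary of what a match call reported. */
typedef enum {
  OutcomeMatched, OutcomeNoMatch, OutcomePartial, OutcomeLimitReached, OutcomeError
} MatchOutcome;

MatchOutcome classifyMatchResult(int code) {
  if (code >= 0) return OutcomeMatched;  /* assumption: errors are always < 0 */
  switch (code) {
    case RKMatchErrorNoMatch:
      return OutcomeNoMatch;             /* not a failure, the subject simply did not match */
    case RKMatchErrorPartial:
      return OutcomePartial;             /* only possible when RKMatchPartial was set */
    case RKMatchErrorMatchLimit:
    case RKMatchErrorRecursionLimit:
      return OutcomeLimitReached;        /* backtracking or recursion limit hit */
    default:
      return OutcomeError;               /* bad input, bad options, or an internal error */
  }
}

/* Usage sketch. */
void demoClassifyMatchResult(void) {
  assert(classifyMatchResult(RKMatchErrorNoMatch) == OutcomeNoMatch);
  assert(classifyMatchResult(RKMatchErrorBadUTF8) == OutcomeError);
  printf("partial -> %d\n", (int)classifyMatchResult(RKMatchErrorPartial));
}
```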

RKMatchOption

A collection of bitmask options that can be combined together and passed via the options argument of getRanges:withCharacters:length:inRange:options: or one of the other RKRegex matching methods.

typedef enum {
  RKMatchNoOptions,
  RKMatchAnchored = 1 << 4,
  RKMatchNotBeginningOfLine = 1 << 7,
  RKMatchNotEndOfLine = 1 << 8,
  RKMatchNotEmpty = 1 << 10,
  RKMatchNoUTF8Check = 1 << 13,
  RKMatchPartial = 1 << 15,
  RKMatchNewlineDefault = 0x00000000,
  RKMatchNewlineCR = 0x00100000,
  RKMatchNewlineLF = 0x00200000,
  RKMatchNewlineCRLF = 0x00300000,
  RKMatchNewlineAny = 0x00400000,
  RKMatchNewlineAnyCRLF = 0x00500000,
  RKMatchNewlineMask = 0x00700000,
  RKMatchBackslashRAnyCRLR = 1 << 23,
  RKMatchBackslashRUnicode = 1 << 24
} RKMatchOption;

RKMatchNoOptions

No specific options.

RKMatchAnchored

The RKMatchAnchored option limits getRanges:withCharacters:length:inRange:options: to matching at the first matching position. If the regular expression was compiled with RKCompileAnchored, or turned out to be anchored by virtue of its contents, it cannot be made unanchored at matching time.

RKMatchNotBeginningOfLine

This option specifies that the first character of the subject string is not the beginning of a line, so the circumflex metacharacter should not match before it. Setting this without RKCompileMultiline (at compile time) causes circumflex never to match. This option affects only the behavior of the circumflex metacharacter. It does not affect \A.

RKMatchNotEndOfLine

This option specifies that the end of the subject string is not the end of a line, so the dollar metacharacter should not match it nor (except in RKCompileMultiline mode) a newline immediately before it. Setting this without RKCompileMultiline (at compile time) causes dollar never to match. This option affects only the behavior of the dollar metacharacter. It does not affect \Z or \z.

RKMatchNotEmpty

An empty string is not considered to be a valid match if this option is set. If there are alternatives in the regular expression, they are tried. If all the alternatives match the empty string, the entire match fails. For example, if the regular expression

a?b?

is applied to a string not beginning with "a" or "b", it matches the empty string at the start of the subject. With RKMatchNotEmpty set, this match is not valid, so PCRE searches further into the string for occurrences of "a" or "b".

Perl has no direct equivalent of RKMatchNotEmpty, but it does make a special case of a pattern match of the empty string within its split() function, and when using the /g modifier. It is possible to emulate Perl's behavior after matching a null string by first trying the match again at the same offset with RKMatchNotEmpty and RKMatchAnchored, and then if that fails by advancing the starting offset (see below) and trying an ordinary match again. There is some code that demonstrates how to do this in the pcredemo.c sample program.

RKMatchNoUTF8Check

When RKCompileUTF8 is set at compile time, the validity of the subject as a UTF-8 string is automatically checked when getRanges:withCharacters:length:inRange:options: is subsequently called. The value of searchRange.location is also checked to ensure that it points to the start of a UTF-8 character. If an invalid UTF-8 sequence of bytes is found, getRanges:withCharacters:length:inRange:options: returns the error RKMatchErrorBadUTF8. If searchRange.location contains an invalid value, RKMatchErrorBadUTF8Offset is returned.

If you already know that your subject is valid, and you want to skip these checks for performance reasons, you can set the RKMatchNoUTF8Check option when calling getRanges:withCharacters:length:inRange:options:. You might want to do this for the second and subsequent calls to getRanges:withCharacters:length:inRange:options: if you are making repeated calls to find all the matches in a single subject string. However, you should be sure that the value of searchRange.location points to the start of a UTF-8 character. When RKMatchNoUTF8Check is set, the effect of passing an invalid UTF-8 string as the charactersBuffer, or a value of searchRange.location that does not point to the start of a UTF-8 character, is undefined. Your program may crash.

RKMatchPartial

This option turns on the partial matching feature. If the subject string fails to match the regular expression, but at some point during the matching process the end of the subject was reached (that is, the subject partially matches the pattern and the failure to match occurred only because there were not enough subject characters), getRanges:withCharacters:length:inRange:options: returns RKMatchErrorPartial instead of RKMatchErrorNoMatch. When RKMatchPartial is used, there are restrictions on what may appear in the pattern. These are discussed in Partial Matching in PCRE.

RKMatchNewlineDefault

The default newline sequence defined when the PCRE library was built.

RKMatchNewlineCR

The character 13 (carriage return, CR) is used as the end of line character during the match.

RKMatchNewlineLF

The character 10 (linefeed, LF) is used as the end of line character during the match.

RKMatchNewlineCRLF

The character sequence 13 (carriage return, CR), 10 (linefeed, LF) is used as the end of line character sequence during the match.

RKMatchNewlineAny

Any valid Unicode newline sequence is used as the end of line during the match.

RKMatchNewlineAnyCRLF

Any of the newline character sequences from RKMatchNewlineCR, RKMatchNewlineLF, or RKMatchNewlineCRLF will be used as the end of line character sequence during the match.

RKMatchNewlineMask

A bitmask to extract only the newline setting.

RKMatchBackslashRAnyCRLR

The escape sequence \R in the compiled regular expression will match only CR, LF, or CRLF, temporarily overriding the setting used when the regular expression was compiled. This option is mutually exclusive with RKMatchBackslashRUnicode.

RKMatchBackslashRUnicode

The escape sequence \R in the compiled regular expression will match any Unicode line ending sequence, temporarily overriding the setting used when the regular expression was compiled. This option is mutually exclusive with RKMatchBackslashRAnyCRLR.

Declared in RegexKitTypes.h
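The Perl-style handling of empty matches described under RKMatchNotEmpty can be sketched in C as follows. This is not RegexKit code: toyMatch is a hypothetical stand-in for a real matcher, hard-wired to the pattern a?b? from the example above, and it only honors RKMatchNotEmpty and RKMatchAnchored. The option values are the ones listed in this section, with RKMatchNoOptions assumed to be 0. The loop advances one byte at a time, which is only appropriate for single-byte subjects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Values as listed in this section; redeclared so the sketch compiles alone. */
enum {
  RKMatchNoOptions = 0,        /* assumption: not shown in the listing */
  RKMatchAnchored  = 1 << 4,
  RKMatchNotEmpty  = 1 << 10
};

/* Toy stand-in for a real matcher, hard-wired to the pattern a?b?.
   Returns 0 and fills *from / *to on a match, -1 if nothing matched. */
int toyMatch(const char *s, size_t len, size_t start, unsigned int options,
             size_t *from, size_t *to) {
  for (size_t pos = start; pos <= len; pos++) {
    size_t end = pos;
    if (end < len && s[end] == 'a') end++;
    if (end < len && s[end] == 'b') end++;
    int empty = (end == pos);
    if (!(empty && (options & RKMatchNotEmpty))) {
      *from = pos;
      *to = end;
      return 0;
    }
    if (options & RKMatchAnchored) break;  /* anchored: only try the start offset */
  }
  return -1;
}

/* Counts every match in the subject the way Perl's /g modifier would. */
size_t countMatchesLikePerl(const char *s, size_t len) {
  size_t count = 0, start = 0, from = 0, to = 0;
  unsigned int options = RKMatchNoOptions;
  while (start <= len) {
    if (toyMatch(s, len, start, options, &from, &to) < 0) {
      if (options == RKMatchNoOptions) break;  /* ordinary match failed: done */
      start++;                                 /* retry failed: step past the empty match */
      options = RKMatchNoOptions;
      continue;
    }
    count++;
    printf("match at [%zu, %zu)\n", from, to);
    start = to;
    /* After an empty match, retry at the same offset, non-empty and anchored. */
    options = (from == to) ? (RKMatchNotEmpty | RKMatchAnchored) : RKMatchNoOptions;
  }
  return count;
}

/* Usage sketch: "xab" yields an empty match at 0, "ab" at 1, and an empty match at 3. */
void demoPerlStyleScan(void) {
  assert(countMatchesLikePerl("xab", 3) == 3);
}
```

Without the retry using RKMatchNotEmpty and RKMatchAnchored, a scanning loop would either report the same empty match forever or skip a non-empty match that starts at the same offset.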