Engine: Hyperscan
Features
The following features are supported:
- ✔ Flags
- ✔ Anchors
- ✔ Buffer Boundaries
- ✔ Word Boundaries
- ✔ Alternatives
- ✔ Wildcard
- ✔ Character Classes
- ✔ Posix Character Classes
- ✔ Negated Posix Character Classes
- ✔ Character Class Escapes
- ✔ Character Property Escapes
- ✔ Quantifiers
- ✔ Lazy Quantifiers
- ✔ Capturing Groups
- ✔ Named Capturing Groups
- ✔ Non-Capturing Groups
- ✔ Comments
- ✔ Modifiers
The following features are not supported:
- ❌ Text Segment Boundaries
- ❌ Continuation Escape
- ❌ Collating Elements
- ❌ Equivalence Classes
- ❌ Line Endings Escape
- ❌ Character Class Nested Set
- ❌ Character Class Intersection
- ❌ Character Class Union
- ❌ Character Class Subtraction
- ❌ Character Class Symmetric Difference
- ❌ Character Class Complement
- ❌ Quoted Characters
- ❌ Possessive Quantifiers
- ❌ Backreferences
- ❌ Line Comments
- ❌ Branch Reset
- ❌ Lookahead
- ❌ Lookbehind
- ❌ Non-Backtracking Expressions
- ❌ Recursion
- ❌ Conditional Expressions
- ❌ Subroutines
- ❌ Callouts
- ❌ Backtracking Control Verbs
Feature: Flags
Flags control certain aspects of the matching behavior of a pattern.
Syntax
The following flags are supported:
- `i`: Ignore Case. Matches character classes using a case-insensitive comparison.
- `m`: Multiline. Causes the anchors `^` and `$` to match the start and end of each line (respectively), rather than the start and end of the input.
- `s`: Singleline. Causes the wildcard `.` to match newline characters.
- `x`: Extended Mode. Ignores whitespace in a pattern. Spaces must instead be represented by `\s` or `\ ` (an escaped space).
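As a rough illustration of the four flags, the sketch below uses Python's built-in `re` module as a stand-in, on the assumption that `re.IGNORECASE`, `re.MULTILINE`, `re.DOTALL` and `re.VERBOSE` behave comparably to `i`, `m`, `s` and `x`. It does not call Hyperscan, and details may differ between the two engines.

```python
import re

# Python's `re` is a stand-in here; it is not Hyperscan.
text = "Hello\nworld"

# i: case-insensitive comparison
ignore_case = re.search(r"hello", text, re.IGNORECASE) is not None

# m: ^ and $ match at the start and end of each line
multiline = re.findall(r"^\w+$", text, re.MULTILINE)

# s: the wildcard . also matches the newline between the two words
dotall = re.search(r"Hello.world", text, re.DOTALL) is not None
no_dotall = re.search(r"Hello.world", text) is not None

# x: unescaped whitespace in the pattern is ignored; "\ " is a literal space
verbose = re.search(r"a \ b", "a b", re.VERBOSE) is not None
```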
Feature: Anchors
Anchors match the start or end of a line.
Syntax
- `^`: Matches the start of a line when the `m` (multiline) flag is set. Otherwise, matches the start of the input.
- `$`: Matches the end of a line when the `m` (multiline) flag is set. Otherwise, matches the end of the input.
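The effect of the `m` flag on both anchors can be sketched as follows. Python's `re` module is used as a stand-in (an assumption for illustration, not Hyperscan itself), with `re.MULTILINE` playing the role of `m`.

```python
import re

# Python's `re` is a stand-in here; it is not Hyperscan.
text = "one\ntwo"

# Without m, ^ matches only at the start of the input.
starts_default = re.findall(r"^\w+", text)
# With m, ^ matches at the start of every line.
starts_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# Without m, $ matches only at the end of the input.
ends_default = re.findall(r"\w+$", text)
# With m, $ matches at the end of every line.
ends_multiline = re.findall(r"\w+$", text, re.MULTILINE)
```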
Feature: Buffer Boundaries
A Buffer Boundary is an Atom that matches the start or the end of the input. This differs slightly from `^` and `$`, which can be affected by RegExp flags like `m`.
Syntax
- `\A`: Matches the start of the input.
- `\z`: Matches the end of the input.
- `\Z`: A zero-width assertion consisting of an optional newline at the end of the buffer. Equivalent to `(?=\n?\z)`.
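The difference between the two end-of-input boundaries can be sketched with Python's `re` module as a stand-in (not Hyperscan). One naming caveat applies: Python spells the absolute end of input as `\Z`, which corresponds to `\z` in this section, so this section's `\Z` is emulated below with the lookahead given above. The constant names are hypothetical and exist only for this example.

```python
import re

# Python's `re` is a stand-in here; it is not Hyperscan.
# Python's \Z is the absolute end of input (this section's \z).
END_OF_INPUT = r"\Z"
# This section's \Z, written as the equivalent lookahead in Python spelling.
BEFORE_FINAL_NEWLINE = r"(?=\n?\Z)"

text = "first\nlast\n"

# \A is not affected by the m flag, unlike ^.
caret_hits = re.findall(r"^\w+", text, re.MULTILINE)
buffer_start_hits = re.findall(r"\A\w+", text, re.MULTILINE)

# "last" is followed by a newline, so it is not at the absolute end...
strict_end = re.search(r"last" + END_OF_INPUT, text) is not None
# ...but it is at the end once an optional final newline is allowed.
relaxed_end = re.search(r"last" + BEFORE_FINAL_NEWLINE, text) is not None
```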
Feature: Word Boundaries
A Word Boundary is an Atom that matches the start or the end of a word.
Syntax
- `\b`: Matches the start or the end of a word.
- `\B`: Matches when not at the start or the end of a word.
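A small sketch of both assertions, using Python's `re` module as a stand-in (an assumption for illustration; it is not Hyperscan):

```python
import re

# Python's `re` is a stand-in here; it is not Hyperscan.
text = "cat concat cats"

# \b on both sides: only the standalone word qualifies.
whole_word = re.findall(r"\bcat\b", text)

# \b before "cat": offsets where "cat" begins a word.
word_start = [m.start() for m in re.finditer(r"\bcat", text)]

# \B before "cat": offsets where "cat" begins inside a word.
inside_word = [m.start() for m in re.finditer(r"\Bcat", text)]
```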
Feature: Text Segment Boundaries
❌ This feature is not supported.
Feature: Continuation Escape
❌ This feature is not supported.
Feature: Alternatives
An Alternative represents two or more branches in a pattern. If the first branch of a pattern fails to match, each remaining alternative is attempted from left to right until a match is found.
Syntax
- `…|…`: Matches the pattern to the left of the `|`. If that fails, matches the pattern to the right of the `|`.
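The left-to-right behavior described above can be sketched with Python's `re` module as a stand-in. This is an assumption for illustration: `re` is a backtracking engine, and the way another engine reports matches for overlapping branches may differ.

```python
import re

# Python's `re` is a stand-in here; it is not Hyperscan.
animal = re.compile(r"cat|dog|bird")
found = animal.findall("a dog chased a cat")

# Branches are tried left to right, so the first one that matches is used.
short_branch_first = re.match(r"a|ab", "ab").group()
long_branch_first = re.match(r"ab|a", "ab").group()
```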
Feature: Wildcard
A Wildcard matches a single, non-newline character.
Syntax
- `.`: Matches any character except newline characters. If the `s` (single-line) flag is set, this matches any character.
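A minimal sketch of the wildcard with and without the `s` flag, using Python's `re` module as a stand-in (not Hyperscan), where `re.DOTALL` is assumed to correspond to `s`:

```python
import re

# Python's `re` is a stand-in here; it is not Hyperscan.
text = "abc a\nc"

# By default . does not match the newline in "a\nc".
without_s = re.findall(r"a.c", text)

# With the s flag (re.DOTALL), . matches the newline as well.
with_s = re.findall(r"a.c", text, re.DOTALL)
```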
Feature: Character Classes
A Character Class is an Atom that specifies a set of characters and matches any single character in that set.
Syntax
- `[…]`: Where `…` is one or more single characters or character class escapes, excluding `^` at the start and `-` between two entries in the set. Matches a character in the set. Example: `[abc]` matches `a`, `b`, or `c`.
- `[^…]`: Where `…` is one or more single characters or character class escapes, excluding `-` between two entries in the set. Matches any character not in the set. Example: `[^abc]` matches `d`, `e`, or `f`, etc., but not `a`, `b`, or `c`.
- `[a-z]`: Where `a` and `z` are single characters or character escapes. Matches any character in the range between `a` and `z` (inclusive). Example: `[a-c]` matches `a`, `b`, or `c`, but not `d`.
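The three forms above can be checked with a short sketch. Python's `re` module is used as a stand-in (an assumption for illustration; it is not Hyperscan).

```python
import re

# Python's `re` is a stand-in here; it is not Hyperscan.
in_set = re.findall(r"[abc]", "abcdef")       # characters listed in the set
not_in_set = re.findall(r"[^abc]", "abcdef")  # characters outside the set
in_range = re.findall(r"[a-c]", "abcd")       # characters in an inclusive range
```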
See Also
- Posix Character Classes
- Negated Posix Character Classes
- Collating Elements
- Equivalence Classes
- Character Class Escapes
- Line Endings Escape
- Character Property Escapes
- Character Class Nested Set
- Character Class Intersection
- Character Class Union
- Character Class Subtraction
- Character Class Symmetric Difference
- Character Class Complement
Feature: Posix Character Classes
A Posix Character Class is a member of a Character Class set that specifies a named, pre-defined set of characters.
Syntax
- `[[:name:]]`: Where name is in a set of predefined names. Matches any character in the set.
See Also
- Character Classes
- Negated Posix Character Classes
- Collating Elements
- Equivalence Classes
- Character Class Escapes
- Line Endings Escape
- Character Property Escapes
- Character Class Nested Set
- Character Class Intersection
- Character Class Union
- Character Class Subtraction
- Character Class Symmetric Difference
- Character Class Complement
Feature: Negated Posix Character Classes
A Negated Posix Character Class is a member of a Character Class set that specifies a named, pre-defined set of excluded characters.
Syntax
- `[[:^name:]]`: Where name is in a set of predefined names. Matches any character not in the set.
See Also
- Character Classes
- Posix Character Classes
- Collating Elements
- Equivalence Classes
- Character Class Escapes
- Line Endings Escape
- Character Property Escapes
- Character Class Nested Set
- Character Class Intersection
- Character Class Union
- Character Class Subtraction
- Character Class Symmetric Difference
- Character Class Complement
Feature: Collating Elements
❌ This feature is not supported.
See Also
- Character Classes
- Posix Character Classes
- Negated Posix Character Classes
- Equivalence Classes
- Character Class Escapes
- Line Endings Escape
- Character Property Escapes
- Character Class Nested Set
- Character Class Intersection
- Character Class Union
- Character Class Subtraction
- Character Class Symmetric Difference
- Character Class Complement
Feature: Equivalence Classes
❌ This feature is not supported.
See Also
- Character Classes
- Posix Character Classes
- Negated Posix Character Classes
- Collating Elements
- Character Class Escapes
- Line Endings Escape
- Character Property Escapes
- Character Class Nested Set
- Character Class Intersection
- Character Class Union
- Character Class Subtraction
- Character Class Symmetric Difference
- Character Class Complement
Feature: Character Class Escapes
A Character Class Escape is a single character escape that represents an entire character class. It can be used as an element of a Character Class or as an Atom. Often, a lower-case escape character is the inclusive set, while the upper-case variant of the same character excludes that set.
Syntax
- `\d`: A decimal digit character in the range 0-9. Equivalent to `[0-9]`.
- `\D`: Any character not in the range 0-9. Equivalent to `[^0-9]`.
- `\w`: Any “word” character. Equivalent to `[a-zA-Z0-9_]`.
- `\W`: Any non-“word” character. Equivalent to `[^a-zA-Z0-9_]`.
- `\s`: Any whitespace character.
- `\S`: Any non-whitespace character.
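The equivalences listed above can be sketched with Python's `re` module as a stand-in (not Hyperscan). The `re.ASCII` flag is passed so that `\d` and `\w` are limited to the ASCII ranges given here; that flag is a Python detail and an assumption of this example.

```python
import re

# Python's `re` is a stand-in here; it is not Hyperscan.
sample = "a1_ -!"

digits = re.findall(r"\d", sample, re.ASCII)
words = re.findall(r"\w", sample, re.ASCII)
non_words = re.findall(r"\W", sample, re.ASCII)
spaces = re.findall(r"\s", sample, re.ASCII)

# The escapes agree with the explicit classes they are described as.
digit_class_agrees = digits == re.findall(r"[0-9]", sample)
word_class_agrees = words == re.findall(r"[a-zA-Z0-9_]", sample)
```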
See Also
- Character Classes
- Posix Character Classes
- Negated Posix Character Classes
- Collating Elements
- Equivalence Classes
- Line Endings Escape
- Character Property Escapes
- Character Class Nested Set
- Character Class Intersection
- Character Class Union
- Character Class Subtraction
- Character Class Symmetric Difference
- Character Class Complement
Feature: Line Endings Escape
❌ This feature is not supported.
See Also
- Character Classes
- Posix Character Classes
- Negated Posix Character Classes
- Collating Elements
- Equivalence Classes
- Character Class Escapes
- Character Property Escapes
- Character Class Nested Set
- Character Class Intersection
- Character Class Union
- Character Class Subtraction
- Character Class Symmetric Difference
- Character Class Complement
Feature: Character Property Escapes
A Character Property Escape is an escape sequence used to match a character with a specific character property.
Syntax
- `\pX`: Where X is a single character. Matches a character that has the property X.
- `\p{name}`: Where name is a predefined property name. Matches a character that has the property name.
- `\PX`: Where X is a single character. Matches a character that does not have the property X.
- `\P{name}`: Where name is a predefined property name. Matches a character that does not have the property name.
See Also
- Character Classes
- Posix Character Classes
- Negated Posix Character Classes
- Collating Elements
- Equivalence Classes
- Character Class Escapes
- Line Endings Escape
- Character Class Nested Set
- Character Class Intersection
- Character Class Union
- Character Class Subtraction
- Character Class Symmetric Difference
- Character Class Complement
Feature: Character Class Nested Set
❌ This feature is not supported.
See Also
- Character Classes
- Posix Character Classes
- Negated Posix Character Classes
- Collating Elements
- Equivalence Classes
- Character Class Escapes
- Line Endings Escape
- Character Property Escapes
- Character Class Intersection
- Character Class Union
- Character Class Subtraction
- Character Class Symmetric Difference
- Character Class Complement
Feature: Character Class Intersection
❌ This feature is not supported.
See Also
- Character Classes
- Posix Character Classes
- Negated Posix Character Classes
- Collating Elements
- Equivalence Classes
- Character Class Escapes
- Line Endings Escape
- Character Property Escapes
- Character Class Nested Set
- Character Class Union
- Character Class Subtraction
- Character Class Symmetric Difference
- Character Class Complement
Feature: Character Class Union
❌ This feature is not supported.
See Also
- Character Classes
- Posix Character Classes
- Negated Posix Character Classes
- Collating Elements
- Equivalence Classes
- Character Class Escapes
- Line Endings Escape
- Character Property Escapes
- Character Class Nested Set
- Character Class Intersection
- Character Class Subtraction
- Character Class Symmetric Difference
- Character Class Complement
Feature: Character Class Subtraction
❌ This feature is not supported.
See Also
- Character Classes
- Posix Character Classes
- Negated Posix Character Classes
- Collating Elements
- Equivalence Classes
- Character Class Escapes
- Line Endings Escape
- Character Property Escapes
- Character Class Nested Set
- Character Class Intersection
- Character Class Union
- Character Class Symmetric Difference
- Character Class Complement
Feature: Character Class Symmetric Difference
❌ This feature is not supported.
See Also
- Character Classes
- Posix Character Classes
- Negated Posix Character Classes
- Collating Elements
- Equivalence Classes
- Character Class Escapes
- Line Endings Escape
- Character Property Escapes
- Character Class Nested Set
- Character Class Intersection
- Character Class Union
- Character Class Subtraction
- Character Class Complement
Feature: Character Class Complement
❌ This feature is not supported.
See Also
- Character Classes
- Posix Character Classes
- Negated Posix Character Classes
- Collating Elements
- Equivalence Classes
- Character Class Escapes
- Line Endings Escape
- Character Property Escapes
- Character Class Nested Set
- Character Class Intersection
- Character Class Union
- Character Class Subtraction
- Character Class Symmetric Difference
Feature: Quoted Characters
❌ This feature is not supported.
Feature: Quantifiers
Quantifiers specify repetition of an Atom. By default, quantifiers are “greedy” in that they attempt to match as many instances of the preceding Atom as possible to satisfy the pattern before backtracking.
Syntax
- `*`: Matches the preceding Atom zero or more times. Example: `a*b` matches `b`, `ab`, `aab`, `aaab`, etc.
- `+`: Matches the preceding Atom one or more times. Example: `a+b` matches `ab`, `aab`, `aaab`, etc., but not `b`.
- `?`: Matches the preceding Atom zero or one times. Example: `a?b` matches `b`, `ab`.
- `{n}`: Where n is an integer. Matches the preceding Atom exactly n times. Example: `a{2}` matches `aa` but not `a` or `aaa`.
- `{n,}`: Where n is an integer. Matches the preceding Atom at least n times. Example: `a{2,}` matches `aa`, `aaa`, `aaaa`, etc., but not `a`.
- `{n,m}`: Where n and m are integers, and m >= n. Matches the preceding Atom at least n times and at most m times. Example: `a{2,3}` matches `aa` and `aaa`, but not `a` or `aaaa`.
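The repetition counts above can be verified with a short sketch. Python's `re` module is used as a stand-in (an assumption for illustration, not Hyperscan), and `accepted` is a hypothetical helper written only for this example.

```python
import re

# Python's `re` is a stand-in here; it is not Hyperscan.
candidates = ["", "a", "aa", "aaa", "aaaa"]

def accepted(pattern):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

zero_or_more = accepted(r"a*")
one_or_more = accepted(r"a+")
zero_or_one = accepted(r"a?")
exactly_two = accepted(r"a{2}")
at_least_two = accepted(r"a{2,}")
two_to_three = accepted(r"a{2,3}")
```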
Feature: Lazy Quantifiers
Lazy Quantifiers specify repetition of an Atom, but attempt to match as few instances of the preceding Atom as possible to satisfy the pattern before advancing.
Syntax
- `*?`: Matches the preceding Atom zero or more times.
- `+?`: Matches the preceding Atom one or more times.
- `??`: Matches the preceding Atom zero or one times.
- `{n}?`: Where n is an integer. Matches the preceding Atom exactly n times.
- `{n,}?`: Where n is an integer. Matches the preceding Atom at least n times.
- `{n,m}?`: Where n and m are integers, and m >= n. Matches the preceding Atom at least n times and at most m times.
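The contrast between greedy and lazy repetition can be sketched as follows, using Python's `re` module as a stand-in. This is an assumption for illustration: `re` is a backtracking engine, and the extent of the match another engine reports may differ.

```python
import re

# Python's `re` is a stand-in here; it is not Hyperscan.
text = "<a><b>"

# Greedy: + takes as many characters as possible.
greedy = re.match(r"<.+>", text).group()
# Lazy: +? takes as few characters as possible.
lazy = re.match(r"<.+?>", text).group()

# The same contrast with a bounded repetition.
greedy_range = re.match(r"a{2,4}", "aaaa").group()
lazy_range = re.match(r"a{2,4}?", "aaaa").group()
```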
Feature: Possessive Quantifiers
❌ This feature is not supported.
Feature: Capturing Groups
A Capturing Group is a subexpression that can be treated as an Atom and can be repeated using Quantifiers. In engines that support Backreferences it can also be referenced by index; Backreferences are listed as not supported for this engine. A Capturing Group can be captured and returned by the matching algorithm.
Syntax
- `(…)`: Groups the subexpression as a single Atom. The result is captured and returned by the matching algorithm.
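Grouping and capture can be sketched with Python's `re` module as a stand-in (an assumption for illustration; it is not Hyperscan, and how captures are returned depends on the engine's API).

```python
import re

# Python's `re` is a stand-in here; it is not Hyperscan.
m = re.match(r"(\d+)-(\d+)", "10-20")

whole_match = m.group(0)  # the text matched by the full pattern
captures = m.groups()     # the text captured by each group, in order

# A group is a single Atom, so a Quantifier applies to the whole group.
repeated_group = re.fullmatch(r"(ab)+", "ababab") is not None
```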
Feature: Named Capturing Groups
A Named Capturing Group is a subexpression that can be captured and returned by the matching algorithm. A Named Capturing Group is also an Atom and can be repeated using Quantifiers. In engines that support Backreferences it can also be referenced by name; Backreferences are listed as not supported for this engine.
Syntax
- `(?<name>…)`: Groups the subexpression as a single Atom associated with the provided name. The result is captured and returned by the matching algorithm.
- `(?'name'…)`: Groups the subexpression as a single Atom associated with the provided name. The result is captured and returned by the matching algorithm.
Feature: Non-Capturing Groups
A Non-capturing Group is a subexpression that can be treated as an Atom and can be repeated using Quantifiers but cannot be referenced using Backreferences. A Non-capturing Group is not captured by the matching algorithm.
Syntax
- `(?:…)`: Groups the subexpression as a single Atom.
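A short sketch of grouping without capture, using Python's `re` module as a stand-in (an assumption for illustration; it is not Hyperscan):

```python
import re

# Python's `re` is a stand-in here; it is not Hyperscan.
pattern = re.compile(r"(?:ab)+(c)")
m = pattern.match("ababc")

# (?:ab) can be repeated as one Atom, but only (c) is captured.
captures = m.groups()
capturing_group_count = pattern.groups
```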
Feature: Backreferences
❌ This feature is not supported.
Feature: Comments
A Comment is a sequence of characters that is ignored by pattern matching and can be used to document a pattern.
Syntax
- `(?#…)`: The entire expression is removed from the pattern. A comment may not contain other `(` or `)` characters.
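The sketch below shows that a commented pattern behaves like the same pattern with the comments removed. Python's `re` module is used as a stand-in (an assumption for illustration; it is not Hyperscan).

```python
import re

# Python's `re` is a stand-in here; it is not Hyperscan.
commented = r"\d{4}(?#year)-\d{2}(?#month)"
plain = r"\d{4}-\d{2}"

commented_matches = re.fullmatch(commented, "2024-05") is not None
plain_matches = re.fullmatch(plain, "2024-05") is not None

# The comment text itself is never matched against the input.
comment_text_ignored = re.fullmatch(commented, "2024year-05month") is None
```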
Feature: Line Comments
❌ This feature is not supported.
Feature: Modifiers
Modifiers allow you to change the currently active RegExp flags within a subexpression.
Syntax
- `(?imsx-imsx)`: Sets or unsets (using `-`) the specified RegExp flags starting at the current position until the next closing `)` or the end of the pattern. Example: `(?-i)A(?i)B(?-i)C` matches `ABC`, `AbC`.
- `(?imsx-imsx:…)`: Sets or unsets (using `-`) the specified RegExp flags for the subexpression. Example: `(?-i:A(?i:B)C)` matches `ABC`, `AbC`.
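The scoped example above can be checked with Python's `re` module as a stand-in (an assumption for illustration; it is not Hyperscan). Only the scoped form is shown, because recent Python versions reject an unscoped modifier such as `(?i)` in the middle of a pattern.

```python
import re

# Python's `re` is a stand-in here; it is not Hyperscan.
# Case matters for A and C; the inner (?i:B) makes only B case-insensitive.
pattern = re.compile(r"(?-i:A(?i:B)C)")

candidates = ["ABC", "AbC", "aBC", "ABc"]
accepted = [s for s in candidates if pattern.fullmatch(s)]
```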
Feature: Branch Reset
❌ This feature is not supported.
Feature: Lookahead
❌ This feature is not supported.
Feature: Lookbehind
❌ This feature is not supported.
Feature: Non-Backtracking Expressions
❌ This feature is not supported.
Feature: Recursion
❌ This feature is not supported.
Feature: Conditional Expressions
❌ This feature is not supported.
Feature: Subroutines
❌ This feature is not supported.
Feature: Callouts
❌ This feature is not supported.
Feature: Backtracking Control Verbs
❌ This feature is not supported.