PCRE Branch Reset Operator
In Perl Compatible Regular Expressions (PCRE) and some other dialects you sometimes end up with a flurry of capture groups. If you combine them with alternatives/OR/branching, you’ll end up with empty entries in the match result, because capture groups are numbered by their position in the pattern regardless of which branch actually matched.
So a pattern like /([a-z])|([0-9])/
will result in capture groups like these (the full match is left out, and the groups are listed from index 0):
# Matching on '5'
[
0: '',
1: '5'
]
# Matching on 'e'
[
0: 'e',
1: ''
]
simply because ([a-z]) is the first capture group and ([0-9]) is the second one, regardless of which branch matched.
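To make the numbering concrete, here is a minimal sketch in Python using the standard re module. This is an assumption on my part for illustration: Python's re is not PCRE, but it numbers capture groups the same way for this pattern. The helper name capture_groups is made up, and note that Python reports a group that did not take part in the match as None rather than as an empty string.

```python
import re

# The same pattern as above, without branch reset.
PATTERN = re.compile(r"([a-z])|([0-9])")


def capture_groups(text):
    """Return the numbered capture groups for a match at the start of text.

    Hypothetical helper for illustration only. Returns None if nothing matches.
    """
    match = PATTERN.match(text)
    return match.groups() if match else None


# Both groups always get a slot, whichever branch matched.
print(capture_groups("5"))  # (None, '5')
print(capture_groups("e"))  # ('e', None)
```

The digit always lands in the second slot and the letter in the first one, so the calling code has to know which branch matched before it can pick the right index.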
However, with the branch reset operator (?|...) every alternative restarts its group numbering at the same index, so the groups of unvisited branches no longer take up slots of their own: /(?|([a-z])|([0-9]))/
and the resulting capture groups will be:
# Matching on '5'
[
0: '5'
]
# Matching on 'e'
[
0: 'e'
]
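For comparison, Python's standard re module does not accept (?|...), so the sketch below only emulates the effect by dropping the groups that did not take part in the match. It is a rough stand-in under that assumption, not a real branch reset: it only lines up with the output above because each branch has exactly one group. The helper name collapsed_groups is made up for illustration.

```python
import re

# Plain alternation; the stdlib re module has no branch reset syntax.
PATTERN = re.compile(r"([a-z])|([0-9])")


def collapsed_groups(text):
    """Return only the groups that took part in the match.

    Hypothetical helper that mimics what (?|...) gives you for this pattern.
    Returns None if nothing matches.
    """
    match = PATTERN.match(text)
    if match is None:
        return None
    # Drop the slots belonging to the branch that was not visited.
    return [group for group in match.groups() if group is not None]


print(collapsed_groups("5"))  # ['5']
print(collapsed_groups("e"))  # ['e']
```

In an engine that supports branch reset you get this for free, and the value is always at the same index no matter which branch matched.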
I’m aware that the example is totally contrived; I’m merely trying to keep it simple to make the concept more approachable.