PCRE Branch Reset Operator
In Perl Compatible Regular Expressions (PCRE) and other dialects you sometimes end up with a flurry of capture groups. If you then want to use alternatives/OR/branching, you'll get empty captures, because capture groups are numbered by their position in the pattern regardless of which branch actually matched.
So a pattern like
/([a-z])|([0-9])/ will result in an array of captures (groups only, indexed from 0) like:
# Matching on '5'
[ 0: '', 1: '5' ]
# Matching on 'e'
[ 0: 'e', 1: '' ]
This happens simply because
([a-z]) is always the first capture group and
([0-9]) is always the second one, regardless of which branch matched.
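The same numbering behavior can be reproduced outside PCRE. Below is a minimal sketch using Python's standard `re` module, which numbers groups the same way for this pattern. The helper name `captures` is my own, and `default=""` is only there to mirror the empty strings in the listing above (Python would otherwise report an unvisited group as `None`).

```python
import re

# Same pattern as above, without branch reset: two numbered groups.
pattern = re.compile(r"([a-z])|([0-9])")


def captures(text):
    """Return the capture groups of the first match, or None if nothing matches.

    Hypothetical helper for illustration only.
    """
    match = pattern.search(text)
    if match is None:
        return None
    # Groups that did not participate in the match are reported as "".
    return match.groups(default="")


print(captures("5"))  # ('', '5')
print(captures("e"))  # ('e', '')
```

Whichever branch matches, the result always has two slots, one per group in the pattern.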
However, with the branch reset operator (?|...) every alternative restarts its group numbering from the same index, so the groups of unvisited branches no longer take up slots of their own:
/(?|([a-z])|([0-9]))/ and the resulting matches will be:
# Matching on '5'
[ 0: '5' ]
# Matching on 'e'
[ 0: 'e' ]
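Not every engine understands (?|...). Python's standard `re` module, for one, rejects it. For comparison, here is a rough sketch of how the same single-slot result could be emulated there. It is not branch reset itself, just an approximation that works because each alternative in this pattern contains exactly one group. The helper name `branch_reset_capture` is made up for this example.

```python
import re

# Plain alternation; Python's stdlib `re` has no (?|...) syntax.
pattern = re.compile(r"([a-z])|([0-9])")


def branch_reset_capture(text):
    """Emulate the single-slot result of (?|([a-z])|([0-9])).

    Hypothetical helper; assumes every alternative holds exactly one group.
    """
    match = pattern.search(text)
    if match is None:
        return None
    # lastindex is the number of the last group that took part in the match,
    # which here is the one group of the branch that was visited.
    return [match.group(match.lastindex)]


print(branch_reset_capture("5"))  # ['5']
print(branch_reset_capture("e"))  # ['e']
```

With a real branch reset the engine does this bookkeeping for you, which is the whole appeal once branches contain several groups.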
I’m aware that the example is totally contrived; I’m merely keeping it simple to make the concept more approachable.