If the syntax for a recursive subpattern call (either by number or by name) is used outside the parentheses to which it refers, it operates like a subroutine in a programming language. The called subpattern may be defined before or after the reference. A numbered reference can be absolute or relative, as in these examples:
(...(absolute)...)...(?2)...
(...(relative)...)...(?-1)...
(...(?+1)...(relative)...
An earlier example pointed out that the pattern
(sens|respons)e and \1ibility
matches "sense and sensibility" and "response and responsibility", but not "sense and responsibility". If instead the pattern
(sens|respons)e and (?1)ibility
is used, it does match "sense and responsibility" as well as the other two strings. Another example is given in the discussion of DEFINE above.
All subroutine calls, whether recursive or not, are always treated as atomic groups. That is, once a subroutine has matched some of the subject string, it is never re-entered, even if it contains untried alternatives and there is a subsequent matching failure. Any capturing parentheses that are set during the subroutine call revert to their previous values afterwards.
Processing options such as case-independence are fixed when a subpattern is defined, so if it is used as a subroutine, such options cannot be changed for different calls. For example, consider this pattern:
(abc)(?i:(?-1))
It matches "abcabc". It does not match "abcABC" because the change of processing option does not affect the called subpattern.
Philip Hazel
University Computing Service
Cambridge CB2 3QH, England.
Last updated: 12 November 2013
Copyright © 1997-2013 University of Cambridge.
|