Subpatterns as subroutines

If the syntax for a recursive subpattern call (either by number or by name) is used outside the parentheses to which it refers, it operates like a subroutine in a programming language. The called subpattern may be defined before or after the reference. A numbered reference can be absolute or relative, as in these examples:


  (...(absolute)...)...(?2)...

  (...(relative)...)...(?-1)...

  (...(?+1)...(relative)...
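The numbering rule behind these three forms can be sketched in a few lines of Python. This is only an illustration, not part of any regex engine. It assumes that (?-1) names the most recently opened capturing group before the reference and (?+1) the next one to be opened, and it handles only plain unnamed groups, with no escaped parentheses or character classes. The helper name to_absolute is hypothetical.

```python
import re

# A relative subroutine reference such as (?-1) or (?+2).
RELATIVE_REF = re.compile(r"\(\?([+-])(\d+)\)")
# A "(" that opens a numbered group: one not followed by "?" or "*".
# Simplifying assumption: no escaped parentheses, classes, or named groups.
CAPTURE_OPEN = re.compile(r"\((?![?*])")


def to_absolute(pattern: str) -> str:
    """Rewrite relative references as absolute ones (simplified sketch)."""

    def replace(ref: re.Match) -> str:
        # Count the capturing groups opened before this reference.
        opened = len(CAPTURE_OPEN.findall(pattern, 0, ref.start()))
        offset = int(ref.group(2))
        if ref.group(1) == "-":
            target = opened - offset + 1  # (?-1): most recently opened group
        else:
            target = opened + offset      # (?+1): next group to be opened
        return f"(?{target})"

    return RELATIVE_REF.sub(replace, pattern)


for example in ("(...(relative)...)...(?-1)...", "(...(?+1)...(relative)..."):
    print(example, "->", to_absolute(example))
```

Under these assumptions both relative examples resolve to (?2), the same group that the absolute example calls.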

 

An earlier example pointed out that the pattern

 

  (sens|respons)e and \1ibility

 

matches "sense and sensibility" and "response and responsibility", but not "sense and responsibility". If instead the pattern

 

  (sens|respons)e and (?1)ibility

 

is used, it does match "sense and responsibility" as well as the other two strings. Another example is given in the discussion of DEFINE above.
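The difference can be tried out in Python. Python's standard re module does not accept the (?1) syntax, so the sketch below emulates the call by writing the text of subpattern 1 a second time. That is only an approximation of a subroutine call, but it shows the essential point: a back reference repeats the matched text, whereas a subroutine call runs the pattern again. The variable names are illustrative.

```python
import re

# Back reference: \1 must repeat the exact text that group 1 captured.
backreference = re.compile(r"(sens|respons)e and \1ibility")

# Emulated subroutine call: the source of group 1 is written out again,
# so either alternative may match at the second position.
group_1 = r"sens|respons"
subroutine_like = re.compile(rf"({group_1})e and (?:{group_1})ibility")

subjects = [
    "sense and sensibility",
    "response and responsibility",
    "sense and responsibility",
]
for subject in subjects:
    print(
        subject,
        bool(backreference.fullmatch(subject)),
        bool(subroutine_like.fullmatch(subject)),
    )
```

Only the last subject separates the two patterns: the back reference rejects it, while the emulated subroutine call accepts it.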

 

All subroutine calls, whether recursive or not, are always treated as atomic groups. That is, once a subroutine has matched some of the subject string, it is never re-entered, even if it contains untried alternatives and there is a subsequent matching failure. Any capturing parentheses that are set during the subroutine call revert to their previous values afterwards.
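The effect of atomic treatment can be approximated in Python, whose standard re module has no subroutine calls. The sketch below is an emulation under stated assumptions, not the engine's own mechanism: the called subpattern is written out again, and atomicity is imitated with the familiar lookahead-plus-back-reference idiom. That idiom introduces an extra capturing group 2 which a real subroutine call would not leave set. The pattern and subject are made up for illustration.

```python
import re

# Plain inlining of group 1: the engine may backtrack into the second copy
# and try its other alternative after a later failure.
backtracking = re.compile(r"(a|ab)(?:a|ab)c")

# Atomic emulation: the lookahead records what the copy matches first, and
# the back reference \2 then consumes exactly that text. Once the lookahead
# has succeeded it is not re-entered, so "ab" is never tried at that point.
atomic_like = re.compile(r"(a|ab)(?=(a|ab))\2c")

subject = "aabc"
print(bool(backtracking.fullmatch(subject)))  # second copy retries with "ab"
print(bool(atomic_like.fullmatch(subject)))   # no retry, so the match fails
```

With the subject "aabc" the backtracking version succeeds because the second copy can fall back to "ab", while the atomic version fails because the copy has already committed to "a".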

 

Processing options such as case-independence are fixed when a subpattern is defined, so if it is used as a subroutine, such options cannot be changed for different calls. For example, consider this pattern:

 

  (abc)(?i:(?-1))

 

It matches "abcabc". It does not match "abcABC" because the change of processing option does not affect the called subpattern.
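The same behaviour can be imitated in Python. This is a sketch only, because Python's standard re module has no (?-1) call. The called subpattern is written out again, and its fixed options are modelled by switching case-independence off with (?-i:...) inside the enclosing (?i:...) group. A naive copy without that step is shown for contrast.

```python
import re

# Naive inlining: the copy of "abc" inherits the enclosing (?i:...) option.
naive_inline = re.compile(r"(abc)(?i:abc)")

# Closer to the described behaviour: the copy keeps the options that were in
# force where group 1 was defined, so case-independence is turned off again.
fixed_options = re.compile(r"(abc)(?i:(?-i:abc))")

for subject in ("abcabc", "abcABC"):
    print(
        subject,
        bool(naive_inline.fullmatch(subject)),
        bool(fixed_options.fullmatch(subject)),
    )
```

The second pattern accepts "abcabc" and rejects "abcABC", as described above; the naive copy accepts both.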

Philip Hazel

University Computing Service

Cambridge CB2 3QH, England.

Last updated: 12 November 2013

Copyright © 1997-2013 University of Cambridge.

Copyright © 2024 Rickard Johansson