- [ ] (square brackets) match any one of a set of characters: put the options in square brackets to select between single characters, e.g. /[bc]at/ matches "bat" or "cat"
- . (dot) matches any single character except a newline
- + tells Perl to match one or more of the preceding character; to match several characters in the middle of a pattern, use .+ (one or more of any character)
- .* matches zero or more of any character
- ? matches zero or one of the preceding character
- \ (backslash) indicates that the following character is to be matched literally and not treated as a control character, e.g. \. matches a literal dot
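The metacharacters above can be tried out with a small script. This is a minimal sketch: the helper name matches and all of the test strings are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $string matches $pattern, otherwise 0.
sub matches {
    my ($string, $pattern) = @_;
    return $string =~ $pattern ? 1 : 0;
}

print matches('cat',   qr/[bc]at/),  "\n";  # 1: c is one of the options in the set
print matches('rat',   qr/[bc]at/),  "\n";  # 0: r is not in the set
print matches('cut',   qr/c.t/),     "\n";  # 1: dot matches any single character
print matches("c\nt",  qr/c.t/),     "\n";  # 0: dot does not match a newline
print matches('ct',    qr/c.+t/),    "\n";  # 0: .+ needs at least one character
print matches('ct',    qr/c.*t/),    "\n";  # 1: .* accepts zero characters
print matches('color', qr/colou?r/), "\n";  # 1: ? makes the u optional
print matches('3x14',  qr/3\.14/),   "\n";  # 0: \. matches only a literal dot
```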
modifiers: /test/i
- i – case insensitivity
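- s – treats the string as a single line: . also matches a newline, so a pattern such as /foo.bar/s can match foo on one line and bar on the next
- m – allows ^ and $ to match at the start and end of each line (just after and just before a newline), not only at the start and end of the whole string
- g – matches globally; in scalar context Perl keeps track of where in the string the last match left off, and \G anchors the next match at that position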
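A short sketch of the four modifiers in action. The strings and variable names are invented for the example; each variable holds 1 for a match and 0 otherwise.

```perl
use strict;
use warnings;

my $text = "foo\nbar";

my $case_i       = 'TEST' =~ /test/i    ? 1 : 0;  # 1: i ignores case
my $dot_plain    = $text  =~ /foo.bar/  ? 1 : 0;  # 0: . will not cross the newline
my $dot_s        = $text  =~ /foo.bar/s ? 1 : 0;  # 1: with s, . matches the newline
my $anchor_plain = $text  =~ /^bar$/    ? 1 : 0;  # 0: ^ matches only at the string start
my $anchor_m     = $text  =~ /^bar$/m   ? 1 : 0;  # 1: with m, ^ matches after the newline

# g in scalar context: each pass resumes where the previous match stopped.
my $digits = '12 34 56';
my @found;
while ($digits =~ /(\d+)/g) {
    push @found, $1;
}

print "$case_i $dot_plain $dot_s $anchor_plain $anchor_m\n";
print "@found\n";  # 12 34 56
```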
extract information from part of a match – /alpha(.+)gamma/
- “xxalphazzzgamma”
- “alpha beta gamma delta”
What do the (parentheses) achieve? Everything matched inside the parentheses is put into the Perl variable $1. (If there is a second set of parentheses, whatever it matches goes into $2, and so on.)
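Running the pattern against the two sample strings shows what ends up in $1. This is a minimal sketch; the second pattern and its date-like string are made up to illustrate $2.

```perl
use strict;
use warnings;

my @captured;
for my $string ('xxalphazzzgamma', 'alpha beta gamma delta') {
    if ($string =~ /alpha(.+)gamma/) {
        push @captured, $1;          # text between "alpha" and "gamma"
        print "captured: [$1]\n";
    }
}

# Two sets of parentheses fill $1 and $2 (invented example string).
my ($first, $second);
if ('2024-05' =~ /(\d+)-(\d+)/) {
    ($first, $second) = ($1, $2);
    print "first=$first second=$second\n";
}
```

The brackets in the output make the captured spaces visible: the first string yields [zzz], the second yields [ beta ] with a space on each side.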
\n newline (line feed)
\w a word character [a-zA-Z0-9_]
\W NOT a word character, that is [^a-zA-Z0-9_]
\s white space (new line, carriage return, space, tab, form feed)
\S NOT white space
\d a digit [0-9]
\D NOT a digit, i.e. [^0-9]
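The character classes above can be compared side by side on one string. A minimal sketch, with an invented sample line; in list context, a /g match returns every captured piece.

```perl
use strict;
use warnings;

# Sample line (made up): word characters, a colon, a tab and a trailing newline.
my $line = "id_42:\tok\n";

my @words      = $line =~ /(\w+)/g;  # runs of word characters: "id_42", "ok"
my @non_words  = $line =~ /(\W)/g;   # the colon, the tab and the newline
my @spaces     = $line =~ /(\s)/g;   # the tab and the newline
my @non_spaces = $line =~ /(\S+)/g;  # "id_42:", "ok"
my @digits     = $line =~ /(\d)/g;   # "4", "2"
my @non_digits = $line =~ /(\D)/g;   # every other character

print "words: @words\n";
print "digits: @digits\n";
print scalar(@non_words), " non-word, ", scalar(@spaces), " whitespace, ",
      scalar(@non_digits), " non-digit characters\n";
```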
- \b Match a word boundary
- \B Match a non-(word boundary)
- \A Match only at beginning of string
- \Z Match only at end of string, or before newline at the end
- \z Match only at end of string
- \G Match only where previous m//g left off (works only with /g)
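The anchors can be checked the same way. A minimal sketch with invented strings and variable names; each flag holds 1 for a match and 0 otherwise.

```perl
use strict;
use warnings;

my $b_word   = 'a cat sat'   =~ /\bcat\b/ ? 1 : 0;  # 1: "cat" stands as a whole word
my $b_inside = 'concatenate' =~ /\bcat\b/ ? 1 : 0;  # 0: no boundary around "cat"
my $B_inside = 'concatenate' =~ /\Bcat\B/ ? 1 : 0;  # 1: "cat" sits inside a word

my $text = "one\ntwo\n";
my $A_with_m = $text =~ /\Atwo/m ? 1 : 0;  # 0: \A is the string start, even with m
my $Z_end    = $text =~ /two\Z/  ? 1 : 0;  # 1: \Z tolerates the final newline
my $z_end    = $text =~ /two\z/  ? 1 : 0;  # 0: \z is the very end of the string

# \G forces each match to start exactly where the previous one ended,
# so the scan stops at the first character that is not an "a".
my $run = 'aaabaa';
my $with_G = 0;
$with_G++ while $run =~ /\Ga/g;
my $without_G = () = $run =~ /a/g;  # counts every "a" in the string

print "$b_word $b_inside $B_inside $A_with_m $Z_end $z_end\n";
print "with \\G: $with_G, without: $without_G\n";
```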