an atom is one of:
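(re)     match reported            (?:re)  match not reported
()       empty string, reported    (?:)    empty string, not reported
[chars]  bracket expression        .       any single character
\k       non-alnum k, taken as an ordinary char
\c       alphanum c: an escape
{        literal { if followed by a non-digit; start of a bound if followed by a digit
x        any other char matches itself (default)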
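A minimal sketch of the reporting and non-reporting atoms, using Tcl's regexp command. The sample string and the variable names (whole, k, v, only) are made up for illustration.

```tcl
set s "key=value"

# (re): each parenthesized subexpression reports its match.
regexp {(\w+)=(\w+)} $s whole k v
puts "$whole | $k | $v"        ;# key=value | key | value

# (?:re): groups without reporting, so only one submatch is filled in.
regexp {(?:\w+)=(\w+)} $s whole2 only
puts "$whole2 | $only"         ;# key=value | value

# \. is a literal dot, while a bare . matches any single character.
set literalDot [list [regexp {^a\.b$} "a.b"] [regexp {^a\.b$} "axb"]]
set anyChar    [regexp {^a.b$} "axb"]
puts "$literalDot $anyChar"    ;# 1 0 1
```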
a single quantifier may follow an atom. it is one of:
regular quantifiers    "bounds"
*  (0 or more)         {m}    (exactly m)
+  (1 or more)         {m,}   (m or more)
?  (0 or 1)            {m,n}  (m to n inclusive, m<=n)
any quantifier followed by an extra ? is non-greedy
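A short sketch of greedy versus non-greedy matching and of bounds. The input strings and variable names are illustrative only.

```tcl
set s "<a><b>"

# Greedy + takes the longest match; +? takes the shortest.
regexp {<.+>}  $s greedy
regexp {<.+?>} $s lazy
puts $greedy    ;# <a><b>
puts $lazy      ;# <a>

# Bounds: {m}, {m,} and {m,n}.
set exact   [regexp {^a{2}$}   "aa"]      ;# 1
set atLeast [regexp {^a{2,}$}  "aaaaa"]   ;# 1
set inRange [regexp {^a{2,3}$} "aaa"]     ;# 1
set tooMany [regexp {^a{2,3}$} "aaaa"]    ;# 0
puts "$exact $atLeast $inRange $tooMany"
```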
a constraint matches the empty string under certain conditions.
^   line start    (?=re)  positive lookahead: point where re begins
$   line end      (?!re)  negative lookahead: point where no re begins
shortcuts
\A  string start            \Z  string end
\m  word start              \M  word end
\y  word beginning or end   \Y  not word beginning or end
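The constraints can be tried out with a sketch like the one below. The sample text is invented, and the -line switch is used on the assumption that ^ and $ should match at every line rather than only at the ends of the string.

```tcl
# \m and \M: match "cat" only as a whole word.
set words [regexp -all -inline {\mcat\M} "cat concat cats cat"]
puts $words     ;# cat cat

# (?=re): digits only where " USD" begins right after them.
regexp {\d+(?= USD)} "pay 100 USD" amount
puts $amount    ;# 100

# (?!re): "foo" only where "bar" does not begin.
set neg [list [regexp {foo(?!bar)} "foobar"] [regexp {foo(?!bar)} "foobaz"]]
puts $neg       ;# 0 1

# ^ and $ at line boundaries (newline-sensitive matching via -line).
set text "one two\nthree four"
set firstWords [regexp -all -inline -line {^\w+} $text]
set lastWords  [regexp -all -inline -line {\w+$} $text]
puts $firstWords   ;# one three
puts $lastWords    ;# two four
```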
[abc] 1 char in set
[^abc] 1 char not in set
[a-z] range (inclusive)
rules inside bracket expressions:
all chars are literals except -, ], escapes, and some [... combinations.
for a literal - or ], use a collating element or put \ in front.
[.ba.] collating element
[[:<:]], [[:>:]] empty strings at start/end of word (alnum or _)
alpha,upper,lower -> letters       (x)digit -> (hexa)decimal
alnum,print -> letter or digit     punct -> punctuation
space -> white space               blank -> space or tab
graph -> visible                   cntrl -> control char
shortcuts
\d  [[:digit:]]     \D  [^[:digit:]]
\s  [[:space:]]     \S  [^[:space:]]
\w  [[:alnum:]_]    \W  [^[:alnum:]_]
in bracket expressions \d, \s, \w lose their outer brackets; \D, \S, \W are illegal there.
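A small sketch of bracket expressions, character classes and their shortcuts. All input strings are made up.

```tcl
set text "a1 b22 c333"

# A class and its shortcut select the same characters.
set viaClass    [regexp -all -inline {[[:digit:]]+} $text]
set viaShortcut [regexp -all -inline {\d+} $text]
puts $viaClass      ;# 1 22 333
puts $viaShortcut   ;# 1 22 333

# Inside a bracket expression \d stands for [:digit:] (outer brackets lost).
set mixed [regexp -all -inline {[\d_]+} "a1_2 b_3"]
puts $mixed         ;# 1_2 _3

# Set, negated set, and range.
set inSet    [regexp {^[abc]$}  "b"]   ;# 1
set notInSet [regexp {^[^abc]$} "b"]   ;# 0
set inRange  [regexp {^[a-z]$}  "q"]   ;# 1
puts "$inSet $notInSet $inRange"

set upper [regexp -all -inline {[[:upper:]]} "Tcl And Tk"]
puts $upper         ;# T A T
```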
([bc])\1  matches bb or cc but not bc. subexpressions are numbered by their leading '('
\m, \mnn  back reference: m is a nonzero digit, n is a digit, and mnn <= number of capturing ')' seen so far
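A sketch of back references, including how nested groups are numbered by their opening parenthesis. The patterns and inputs are illustrative only.

```tcl
# \1 must repeat exactly what group 1 matched.
set bb [regexp {^([bc])\1$} "bb"]   ;# 1
set cc [regexp {^([bc])\1$} "cc"]   ;# 1
set bc [regexp {^([bc])\1$} "bc"]   ;# 0
puts "$bb $cc $bc"

# Numbering follows the leading "(": 1 = (a)(b) as a whole, 2 = a, 3 = b.
# \3\2 therefore requires "ba" after "ab".
set nested [regexp {^((a)(b))\3\2$} "abba"]   ;# 1
set wrong  [regexp {^((a)(b))\3\2$} "abab"]   ;# 0
puts "$nested $wrong"
```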
(?xyz) affects rest of RE after ')'
b,e,q rest of RE is BRE,ERE,literal chars
c,i case-sensitive, case-insensitive
s,n,w,p (non), yes, inverse partial, and partial
newline-sensitive (m historical synonym for n)
t,x (tight), expanded syntax:
(ws and chars between # and next \n or RE end ignored)
ws = any char of class space. ws or # is retained when preceded by '\' or in a bracket expr.
they are illegal in multichar symbols like '(?:' or '\('
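The embedded options can be sketched as follows, assuming they are placed at the very start of the RE. The date pattern and the variable names are hypothetical.

```tcl
# i: case-insensitive matching for the rest of the RE.
set plain  [regexp {hello} "HeLLo"]       ;# 0
set nocase [regexp {(?i)hello} "HeLLo"]   ;# 1
puts "$plain $nocase"

# x: expanded syntax; white space and # comments inside the RE are ignored.
set re {(?x)
    (\d{4})   # year
    -
    (\d{2})   # month
}
regexp $re "on 2024-05" -> year month
puts "$year $month"    ;# 2024 05

# n: newline-sensitive, so ^ also matches after each newline.
set lines [regexp -all -inline {(?n)^\w+} "one\ntwo"]
puts $lines            ;# one two
```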
exp - regular expression matched against string.
varName - string is copied into it, with the matching part replaced by subSpec; if omitted, the result is returned instead
subSpec - replaces matching part of string.
& and \0 -> matching part of string.
\n (n is digit 1..9) -> match for n-th () subexpr of exp
& and \ can be escaped with backslashes to make them literal, but then enclose subSpec in braces.
-all - all matching ranges found and substituted using corresponding matches
-nocase - matching case-insensitive, substitution uses original case.
-start index - matching starts at char index; ^ no longer matches the start of a line there, but \A still matches the start of the string at index.
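A brief sketch of regsub with the substitution markers and switches listed above. The strings and variable names are invented for illustration.

```tcl
# Without -all only the first match is replaced; & is the matching part.
set first [regsub {o} "foo boo" {<&>}]
set every [regsub -all {o} "foo boo" {<&>}]
puts $first     ;# f<o>o boo
puts $every     ;# f<o><o> b<o><o>

# \1 and \2 refer to the first and second () subexpressions.
set swapped [regsub {(\w+)=(\w+)} "k=v" {\2=\1}]
puts $swapped   ;# v=k

# An escaped & is a literal ampersand; braces protect the backslash.
set literal [regsub -all {a} "banana" {\&}]
puts $literal   ;# b&n&n&

# With varName the result goes into the variable and the
# number of substitutions is returned.
set count [regsub -all -nocase {tcl} "Tcl and TCL" {[&]} out]
puts "$count $out"   ;# 2 [Tcl] and [TCL]
```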