A regular expression is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions, by using various operators to combine smaller expressions.

The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any metacharacter with special meaning may be quoted by preceding it with a backslash (see examples below).

A list of characters enclosed by **[** and **]** matches any single
character in that list; if the first character of the
list is the caret **^** then it matches any character not in
the list. For example, the regular expression
**[0123456789]** matches any single digit, and **[^0123456789]**
matches any single character that is not a digit. A range of ASCII
characters may be specified by giving the first and last
characters, separated by a hyphen (**[q-t]** is the same as
**[qrst]**). Most metacharacters lose their special
meaning inside lists. To include a literal **]**, place it
first in the list. Similarly, to include a literal **^**,
place it anywhere but first. Finally, to include a literal
**-**, place it last.
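
The list rules above can be tried out with a minimal sketch in Python. This assumes Python's `re` module as a stand-in engine; it is not the tool this text describes, but it follows the same conventions for the list constructs shown here. The helper name `matches_one` is hypothetical.

```python
import re

# Each list matches exactly one character.
digit = re.compile(r"[0123456789]")
non_digit = re.compile(r"[^0123456789]")
q_to_t = re.compile(r"[q-t]")  # same set as [qrst]

# Literal ']' placed first, literal '^' placed after the first position,
# literal '-' placed last.
specials = re.compile(r"[]^-]")


def matches_one(pattern, ch):
    """Return True if the single character ch matches the compiled pattern."""
    return pattern.fullmatch(ch) is not None


print(matches_one(digit, "7"))       # a digit is in the list
print(matches_one(non_digit, "7"))   # a digit is excluded by the caret
print([c for c in "]^-a" if matches_one(specials, c)])
```

Placing each special character in its "safe" position avoids any need for backslashes inside the list.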

The period **.** matches any single character. To match a literal
period, quote it with a backslash: **\.**
The caret **^** and the dollar sign **$** are metacharacters that
respectively match the empty string at the beginning and
end of a line. The symbols **\<** and **\>** respectively match
the empty string at the beginning and end of a word. The
symbol **\b** matches the empty string at the edge of a word,
and **\B** matches the empty string provided it's not at the
edge of a word.
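
As a rough illustration of the period, the line anchors, and the word-edge symbols, here is a sketch using Python's `re` module. Two assumptions apply: `re.MULTILINE` is used so that **^** and **$** work per line, as they would in a line-oriented tool, and **\<** and **\>** are left out because Python's `re` does not provide them (**\b** is the closest equivalent). The sample strings are invented for the example.

```python
import re

# An unquoted period matches any single character; a quoted one only a period.
any_char = re.fullmatch(r"a.c", "abc") is not None      # '.' matches 'b'
literal_dot = re.fullmatch(r"a\.c", "abc") is not None  # '\.' does not match 'b'

# '^' and '$' match the empty string at the beginning and end of each line.
text = "cat\nconcat\ncatalog"
lines_starting_with_cat = re.findall(r"^cat.*$", text, flags=re.MULTILINE)

# '\b' matches the empty string at the edge of a word,
# '\B' matches it anywhere that is not the edge of a word.
sentence = "cat concat catalog"
whole_word = re.findall(r"\bcat\b", sentence)   # only the standalone word
inside_word = re.findall(r"\Bcat", sentence)    # only the 'cat' inside 'concat'

print(any_char, literal_dot)
print(lines_starting_with_cat)
print(len(whole_word), len(inside_word))
```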

A regular expression may be followed by one of several repetition operators:

| Operator | Meaning |
| --- | --- |
| **?** | The preceding item is optional and matched at most once. |
| **\*** | The preceding item will be matched zero or more times. |
| **+** | The preceding item will be matched one or more times. |
| **{n}** | The preceding item is matched exactly n times. |
| **{n,}** | The preceding item is matched n or more times. |
| **{,m}** | The preceding item is optional and is matched at most m times. |
| **{n,m}** | The preceding item is matched at least n times, but not more than m times. |
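
The repetition operators can be checked one by one with a small sketch. It assumes Python's `re` module, which accepts all of the forms listed above, including **{,m}**. The helper `accepts` and the sample strings are hypothetical and chosen only for illustration.

```python
import re


def accepts(pattern, s):
    """Return True if the whole string s matches pattern."""
    return re.fullmatch(pattern, s) is not None


# pattern -> (strings that should match, strings that should not)
examples = {
    r"colou?r": (["color", "colour"], ["colouur"]),  # zero or one 'u'
    r"ab*":     (["a", "abbb"], ["b"]),              # zero or more 'b'
    r"ab+":     (["ab", "abbb"], ["a"]),             # one or more 'b'
    r"a{3}":    (["aaa"], ["aa", "aaaa"]),           # exactly 3
    r"a{2,}":   (["aa", "aaaaa"], ["a"]),            # 2 or more
    r"a{,2}":   (["", "a", "aa"], ["aaa"]),          # at most 2
    r"a{2,3}":  (["aa", "aaa"], ["a", "aaaa"]),      # between 2 and 3
}

for pattern, (good, bad) in examples.items():
    assert all(accepts(pattern, s) for s in good), pattern
    assert not any(accepts(pattern, s) for s in bad), pattern
```

Whole-string matching is used so that each bound is tested exactly; a substring search would also succeed on longer inputs that merely contain a match.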

Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings that respectively match the concatenated subexpressions.

Two regular expressions may be joined by the infix operator
**|**; the resulting regular expression matches any string
matching either subexpression.

Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be enclosed in parentheses to override these precedence rules.
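
The precedence rules can be seen in a short sketch, again assuming Python's `re` module as a stand-in engine that follows the same ordering of repetition, concatenation, and alternation. The helper `accepts` is hypothetical.

```python
import re


def accepts(pattern, s):
    """Return True if the whole string s matches pattern."""
    return re.fullmatch(pattern, s) is not None


# Repetition binds tighter than concatenation: '*' applies to 'b' only.
star_on_b = (accepts(r"ab*", "abbb"), accepts(r"ab*", "abab"))

# Parentheses make '*' apply to the whole of 'ab'.
star_on_group = accepts(r"(ab)*", "abab")

# Concatenation binds tighter than alternation: 'ab|cd' means 'ab' or 'cd'.
alt_plain = [s for s in ("ab", "cd", "abd", "acd") if accepts(r"ab|cd", s)]

# Parentheses restrict the alternation to 'b' or 'c'.
alt_grouped = [s for s in ("ab", "cd", "abd", "acd") if accepts(r"a(b|c)d", s)]

print(star_on_b, star_on_group)
print(alt_plain, alt_grouped)
```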

In basic regular expressions, the metacharacters **?**, **+**,
**{**, **|**, **(**, and **)** lose their special meaning and
match themselves; to get the special behavior, use the backslashed
versions **\?**, **\+**, **\{**, **\|**, **\(**, and **\)**.