Java Regex Cheat Sheet

Regex in Java

Java regular expressions (regex) are a powerful tool for searching, matching, and manipulating text. A regular expression is a pattern written in a compact pattern-matching language: you define the pattern and then search for it in a string. Java regex is widely used in web development, data processing, and text analysis.

The syntax of Java regex closely resembles that of Perl regular expressions. It uses a set of special characters and operators to define patterns. For example, the dot (.) matches any single character (by default, anything except a line terminator), while the asterisk (*) matches zero or more occurrences of the preceding element, which can be a single character, a character class, or a group.

Java regex provides a wide range of features, including character classes, quantifiers, anchors, and groups. Character classes allow you to match a set of characters, such as digits, letters, or special characters. Quantifiers allow you to specify the number of occurrences of a pattern, such as one or more, zero or one, or a specific number. Anchors allow you to match the beginning or end of a string, or a word boundary. Groups allow you to group patterns together and apply operators to them.

Java regex is implemented in the java.util.regex package, which provides a set of classes and methods for working with regular expressions. The most commonly used classes are Pattern and Matcher. Pattern represents a compiled regular expression, while Matcher is used to match the pattern against a string.
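
As a minimal sketch of how Pattern and Matcher fit together, the example below compiles a pattern once and reuses it for two kinds of checks. The class name PatternMatcherSketch and its helper methods are hypothetical names chosen for illustration; only Pattern.compile, Pattern.matcher, Matcher.find, Matcher.group, and Matcher.matches come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the JDK.
class PatternMatcherSketch {
    // Compile once and reuse: Pattern instances are immutable and thread-safe.
    private static final Pattern NUMBER = Pattern.compile("[0-9]+");

    // Collects every non-overlapping run of digits found in the input.
    static List<String> findNumbers(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = NUMBER.matcher(input);
        while (matcher.find()) {         // find() scans forward for the next match
            found.add(matcher.group());  // group() returns the text of the current match
        }
        return found;
    }

    // matches() succeeds only if the entire input matches the pattern.
    static boolean isNumber(String input) {
        return NUMBER.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(findNumbers("order 66 shipped in 3 boxes")); // [66, 3]
        System.out.println(isNumber("2024"));  // true
        System.out.println(isNumber("v2024")); // false
    }
}
```

The difference between find() and matches() is worth keeping in mind: find() looks for the pattern anywhere in the input, while matches() requires the whole input to fit the pattern.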

This cheat sheet provides an overview of the most commonly used regex syntax in Java.

Basic Syntax

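Syntax    Description
.         Matches any single character except a line terminator (any character at all with Pattern.DOTALL)
^         Matches the beginning of the input (the beginning of each line with Pattern.MULTILINE)
$         Matches the end of the input, or just before a final line terminator (the end of each line with Pattern.MULTILINE)
[...]     Matches any single character listed within the brackets
[^...]    Matches any single character not listed within the brackets
|         Matches either the expression before or the expression after the |
(...)     Groups expressions together and captures the matched text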
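
The basic syntax above can be tried out with a short sketch like the following. The class BasicSyntaxSketch and its method names are hypothetical; the example assumes the default flags except where Pattern.MULTILINE is passed explicitly.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the JDK.
class BasicSyntaxSketch {
    // '.' stands for any one character, so "c.t" accepts "cat", "cot", "c-t".
    static boolean isCAnyT(String s) {
        return Pattern.matches("c.t", s);
    }

    // ^ anchors the bracket expression to the start of the input.
    static boolean startsWithVowel(String s) {
        return Pattern.compile("^[aeiou]").matcher(s).find();
    }

    // Alternation inside a group: accepts "gray" or "grey".
    static boolean isGrey(String s) {
        return Pattern.matches("gr(a|e)y", s);
    }

    // Counts lines that begin with '#'.
    // MULTILINE makes ^ match at the start of every line, not only at the start of the input.
    static int countCommentLines(String text) {
        Matcher m = Pattern.compile("^#", Pattern.MULTILINE).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(isCAnyT("cot"));                     // true
        System.out.println(startsWithVowel("pear"));            // false
        System.out.println(isGrey("grey"));                     // true
        System.out.println(countCommentLines("#a\nb\n#c"));     // 2
    }
}
```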

Character Classes

Syntax    Description
\d        Matches any digit (0-9 by default)
\D        Matches any non-digit
\s        Matches any whitespace character (space, tab, newline, and similar)
\S        Matches any non-whitespace character
\w        Matches any word character (ASCII letter, digit, or underscore by default)
\W        Matches any non-word character
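
One practical detail when using these classes from Java source code: inside a string literal every regex backslash has to be written twice, so the regex \d becomes "\\d". The sketch below illustrates this with three small helpers; CharacterClassSketch and its method names are hypothetical and chosen only for this example.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the JDK.
class CharacterClassSketch {
    // In a Java string literal each regex backslash is doubled: "\\D" is the regex \D.
    private static final Pattern NON_DIGIT = Pattern.compile("\\D");
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");
    private static final Pattern WORD_CHARS = Pattern.compile("\\w+");

    // Removes every character that is not a digit.
    static String digitsOnly(String s) {
        return NON_DIGIT.matcher(s).replaceAll("");
    }

    // Trims the input and collapses each run of whitespace into one space.
    static String normalizeSpaces(String s) {
        return WHITESPACE_RUN.matcher(s.trim()).replaceAll(" ");
    }

    // True if the input is non-empty and contains only letters, digits, or underscores.
    static boolean isWordCharsOnly(String s) {
        return WORD_CHARS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("(555) 123-4567"));     // 5551234567
        System.out.println(normalizeSpaces("  a \t b\n c ")); // a b c
        System.out.println(isWordCharsOnly("user_42"));       // true
        System.out.println(isWordCharsOnly("user-42"));       // false
    }
}
```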

Quantifiers

Syntax    Description
*         Matches zero or more occurrences of the preceding expression
+         Matches one or more occurrences of the preceding expression
?         Matches zero or one occurrence of the preceding expression
{n}       Matches exactly n occurrences of the preceding expression
{n,}      Matches n or more occurrences of the preceding expression
{n,m}     Matches between n and m occurrences of the preceding expression
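
To make the quantifiers concrete, here is a small sketch that applies each of them to a whole-string check. The class QuantifierSketch and its method names are hypothetical; Pattern.matches requires the entire input to match.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the JDK.
class QuantifierSketch {
    // {n}: exactly five digits.
    static boolean isFiveDigits(String s) {
        return Pattern.matches("\\d{5}", s);
    }

    // ?: the 'u' is optional, so both "color" and "colour" are accepted.
    static boolean isColor(String s) {
        return Pattern.matches("colou?r", s);
    }

    // {n,m}: between 3 and 16 word characters, inclusive.
    static boolean hasValidLength(String s) {
        return Pattern.matches("\\w{3,16}", s);
    }

    // *: an 'a' followed by zero or more 'b' characters.
    static boolean starMatches(String s) {
        return Pattern.matches("ab*", s);
    }

    // +: an 'a' followed by at least one 'b'.
    static boolean plusMatches(String s) {
        return Pattern.matches("ab+", s);
    }

    public static void main(String[] args) {
        System.out.println(isFiveDigits("12345")); // true
        System.out.println(isColor("colour"));     // true
        System.out.println(hasValidLength("ab"));  // false
        System.out.println(starMatches("a"));      // true
        System.out.println(plusMatches("a"));      // false
    }
}
```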

Anchors and Lookarounds

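Syntax      Description
\b          Matches a word boundary (the position between a word character and a non-word character, or the start or end of the input next to a word character)
\B          Matches any position that is not a word boundary
(?=...)     Positive lookahead: succeeds if ... matches immediately after the current position, without consuming it
(?!...)     Negative lookahead: succeeds if ... does not match immediately after the current position
(?<=...)    Positive lookbehind: succeeds if ... matches immediately before the current position
(?<!...)    Negative lookbehind: succeeds if ... does not match immediately before the current position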
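
All of the constructs in this table match a position rather than any characters. A minimal sketch of how that plays out in practice follows; BoundarySketch and its method names are hypothetical, and the sample inputs are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the JDK.
class BoundarySketch {
    // \b on both sides: finds "cat" only as a whole word.
    static boolean containsWordCat(String s) {
        return Pattern.compile("\\bcat\\b").matcher(s).find();
    }

    // Positive lookahead: digits that are followed by "px"; "px" is not part of the match.
    static String pixelValue(String s) {
        Matcher m = Pattern.compile("\\d+(?=px)").matcher(s);
        return m.find() ? m.group() : null;
    }

    // Positive lookbehind: digits that are preceded by a literal '$'.
    static String dollarAmount(String s) {
        Matcher m = Pattern.compile("(?<=\\$)\\d+").matcher(s);
        return m.find() ? m.group() : null;
    }

    // Negative lookahead: "foo" that is not immediately followed by "bar".
    static boolean hasFooWithoutBar(String s) {
        return Pattern.compile("foo(?!bar)").matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsWordCat("the cat sat"));            // true
        System.out.println(containsWordCat("concatenate"));            // false
        System.out.println(pixelValue("width: 120px; height: 80em"));  // 120
        System.out.println(dollarAmount("total: $45 for 3 items"));    // 45
        System.out.println(hasFooWithoutBar("foobar"));                // false
    }
}
```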

Examples

Regex                 Description
\d{3}-\d{2}-\d{4}     Matches text in the Social Security number format ###-##-####
^[A-Z][a-z]*$         Matches a string that consists of one uppercase letter followed by zero or more lowercase letters
(\d{3})\d{3}-\d{4}    Matches a phone number written as ######-#### and captures its first three digits in group 1
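
The three example patterns can be exercised from Java roughly as follows. This is only a sketch: ExamplePatternsSketch and its method names are hypothetical, the sample inputs are invented, and the patterns check format only, not whether a number is valid.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the JDK.
class ExamplePatternsSketch {
    // Regex backslashes are doubled inside Java string literals.
    private static final Pattern SSN = Pattern.compile("\\d{3}-\\d{2}-\\d{4}");
    private static final Pattern CAPITALIZED = Pattern.compile("^[A-Z][a-z]*$");
    private static final Pattern PHONE = Pattern.compile("(\\d{3})\\d{3}-\\d{4}");

    // True if the whole input has the shape ###-##-####.
    static boolean looksLikeSsn(String s) {
        return SSN.matcher(s).matches();
    }

    // True if the input is one uppercase letter followed only by lowercase letters.
    static boolean isCapitalizedWord(String s) {
        return CAPITALIZED.matcher(s).matches();
    }

    // Returns capturing group 1 (the first three digits), or null if the input does not match.
    static String firstThreeDigits(String s) {
        Matcher m = PHONE.matcher(s);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeSsn("123-45-6789"));     // true
        System.out.println(isCapitalizedWord("Java"));       // true
        System.out.println(isCapitalizedWord("JAVA"));       // false
        System.out.println(firstThreeDigits("555123-4567")); // 555
    }
}
```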
