API to define a pattern for searching or manipulating strings.
Widely used for defining constraints on strings such as password and email validation.
The Java Regex API provides 1 interface and 3 classes in the java.util.regex package:
- MatchResult interface
- Matcher class
- Pattern class
- PatternSyntaxException class
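Of the four types listed above, PatternSyntaxException is the one that signals a malformed regex at compile time. A minimal sketch of catching it follows; the class name `SyntaxCheckDemo` and the helper `isValidRegex` are made up for illustration and are not part of the API.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SyntaxCheckDemo {
    // Hypothetical helper: returns true if the regex compiles, false otherwise.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // getDescription() explains the error, getIndex() gives its approximate position
            System.out.println("Invalid regex: " + e.getDescription()
                    + " near index " + e.getIndex());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("[a-z]+")); // well-formed character class
        System.out.println(isValidRegex("[a-z"));   // unclosed character class
    }
}
```

PatternSyntaxException is unchecked, so catching it is optional; it is mostly useful when the regex comes from user input.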
Matcher class
- Implements the MatchResult interface.
- Regex engine: used to perform match operations on a character sequence.
- Following are the important methods:
- boolean matches()
- boolean find()
- boolean find(int start)
- String group()
- int start()
- int end()
- int groupCount()
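The Matcher methods listed above can be sketched together as follows. The class `MatcherDemo` and its helpers `findAll` and `firstMatchFrom` are hypothetical names chosen for this example; only the Matcher and Pattern calls come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherDemo {
    // Hypothetical helper: records "text@start-end" for every match found by find().
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) { // find() advances to the next match on each call
            hits.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return hits;
    }

    // Hypothetical helper: find(int start) resets the matcher and searches from that index.
    static String firstMatchFrom(String regex, String input, int start) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(start) ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "a12b345"));           // [12@1-3, 345@4-7]
        System.out.println(firstMatchFrom("\\d+", "a12b345", 5)); // 45

        Matcher m = Pattern.compile("(\\w+)@(\\w+)").matcher("user@host");
        System.out.println(m.groupCount()); // 2 capturing groups in the pattern
        System.out.println(m.matches());    // true: the whole input matches
    }
}
```

Note that end() returns the index just past the last matched character, so `input.substring(m.start(), m.end())` equals `m.group()`.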
Pattern class
- Compiled version of a regular expression.
- Used for defining the pattern for the regex engine.
- Following are the important methods:
- static Pattern compile(String regex)
- Matcher matcher(CharSequence input)
- static boolean matches(String regex, CharSequence input)
- String[] split(CharSequence input)
- String pattern()
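As a rough illustration of the Pattern methods listed above, the sketch below compiles one pattern and reuses it for splitting. `PatternDemo` and `splitCsv` are hypothetical names, and the comma-with-optional-whitespace delimiter is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PatternDemo {
    // Compile once and reuse: a comma surrounded by optional whitespace.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // Hypothetical helper: splits a comma-separated line into trimmed fields.
    static String[] splitCsv(String line) {
        return COMMA.split(line);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitCsv("red , green,blue"))); // [red, green, blue]
        System.out.println(COMMA.pattern());                               // \s*,\s*

        // The static matches() compiles the regex and tests the whole input in one call.
        System.out.println(Pattern.matches("[a-z]+", "hello"));  // true
        System.out.println(Pattern.matches("[a-z]+", "hello1")); // false
    }
}
```

Pattern.matches recompiles the regex on every call, so a precompiled Pattern is usually the better choice when the same expression is used repeatedly.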
import java.util.regex.*;

public class RegexExample {
    public static void main(String[] args) {
        // 1st way: compile the pattern, create a matcher, then match
        Pattern p = Pattern.compile(".s");
        Matcher m = p.matcher("as");
        boolean b1 = m.matches();
        // 2nd way: chain the same calls in one expression
        boolean b2 = Pattern.compile(".s").matcher("as").matches();
        // 3rd way: static convenience method
        boolean b3 = Pattern.matches(".s", "as");
        System.out.println(b1 + " " + b2 + " " + b3); // true true true
    }
}
RegEx Essentials
| RegEx Character/Symbol | Usage & Meaning |
|---|---|
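| ^regex | Match at the beginning of the line (beginning of the whole input unless Pattern.MULTILINE is set) |
| regex$ | Match at the end of the line (end of the whole input unless Pattern.MULTILINE is set) |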
| [abc] | a, b, or c |
| [abc][vz] | can match a or b or c followed by either v or z |
| [^abc] | Any character except a, b, or c (negation) |
| [a-zA-Z] | a through z or A through Z, inclusive (range) |
| X\|Z | Finds X or Z |
| XZ | Finds X directly followed by Z |
| X? | X occurs once or not at all |
| X+ | X occurs once or more times |
| X* | X occurs zero or more times |
| X{n} | X occurs exactly n times |
| . | Any character (may or may not match line terminators) |
| \d | Any digit, short for [0-9] |
| \D | Any non-digit, short for [^0-9] |
| \s | Any whitespace character, short for [ \t\n\x0B\f\r] |
| \S | Any non-whitespace character, short for [^\s] |
| \S+ | Several non-whitespace characters |
| \w | Any word character, short for [a-zA-Z_0-9] |
| \W | Any non-word character, short for [^\w] |
| a(?!b) | (Negative look ahead) match “a” if “a” is not followed by “b”. |
The regex is applied to the text from left to right.
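A few of the constructs from the table can be tried out with a short sketch like the one below. The class `EssentialsDemo` and its helpers are hypothetical, introduced only for this example, and the sample inputs are arbitrary.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EssentialsDemo {
    // Hypothetical helper: text of the first (leftmost) match, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // Hypothetical helper: start index of the first match, or -1 if none.
    static int firstMatchStart(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("[abc][vz]", "bz"));  // true
        System.out.println(Pattern.matches("\\d{3}", "1234"));   // false: exactly three digits required
        System.out.println(firstMatch("\\S+", "  hello world")); // hello
        System.out.println(firstMatch("^\\w+", "first second")); // first

        // Negative lookahead: the "a" at index 0 is followed by "b", so it is skipped.
        System.out.println(firstMatchStart("a(?!b)", "abac"));   // 2

        // Left-to-right scanning: the leftmost match is reported first.
        System.out.println(firstMatch("\\d+", "a1b22"));         // 1
    }
}
```

In Java source, each backslash in a regex has to be doubled inside the string literal, which is why `\d` from the table appears as `"\\d"` in the code.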
