We continue our study of regular expressions and the java.util.regex API. The previous lesson was Regular expressions in Java, Part 1.
Character classes
A character class is a set of characters enclosed in square brackets.
Simple classes
The simplest form of a character class lists characters within square brackets. For example, the regular expression [bcr]at matches the words "bat", "cat", or "rat" because it defines a class that accepts the letter "b", "c", or "r" as the first character. To try it, run the test program from the first lesson with this regular expression. The match succeeds only when the first letter is one of the characters defined in the character class.
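The following is a minimal sketch of the [bcr]at example, not the test program from the first lesson. The class name SimpleClassDemo and the helper matchesWhole are hypothetical names chosen for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Pattern;

class SimpleClassDemo {
    // Hypothetical helper: true only if the entire input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String regex = "[bcr]at";
        // "hat" is included to show a first letter outside the class.
        for (String word : new String[] {"bat", "cat", "rat", "hat"}) {
            System.out.println(word + " -> " + matchesWhole(regex, word));
        }
    }
}
```

Here matches() requires the whole input to fit the pattern, so the first three words are accepted and "hat" is rejected because "h" is not in the class.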
Negation
To match characters other than those listed, place the "^" metacharacter at the beginning of the character class. This technique is called negation. A negated class matches any single character that is not listed, so [^bcr]at matches "hat" but not "bat", "cat", or "rat".
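As a rough illustration of negation, the sketch below searches a sample string for every occurrence of [^bcr]at. The class name NegationDemo, the helper findAll, and the sample text are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegationDemo {
    // Hypothetical helper: collects every substring of input that matches regex.
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Words starting with b, c, or r are skipped by the negated class.
        System.out.println(findAll("[^bcr]at", "bat hat cat mat rat"));
        // prints [hat, mat]
    }
}
```

Note that a negated class still consumes exactly one character, so a bare "at" with nothing in front of it would not match.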
Character ranges
Sometimes you need a character class that covers a range of values, such as the letters a to z or the digits 1 to 5. To specify a range, put the "-" metacharacter between the first and last characters, as in [1-5] or [a-h]. You can also combine several ranges in the same character class: [a-zA-Z] accepts any letter of the alphabet regardless of case, a to z (lowercase) or A to Z (uppercase). Ranges can be negated too: [^1-5] matches any character that is not a digit from 1 to 5.
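A short sketch of ranges and negated ranges follows. The class name RangeDemo, the helper matchesWhole, and the sample inputs are hypothetical and chosen only for illustration.

```java
import java.util.regex.Pattern;

class RangeDemo {
    // Hypothetical helper: true only if the entire input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("[1-5]", "3"));           // true: 3 is in 1..5
        System.out.println(matchesWhole("[1-5]", "7"));           // false: 7 is outside 1..5
        System.out.println(matchesWhole("[a-h]", "c"));           // true: c is in a..h
        System.out.println(matchesWhole("[a-zA-Z]+", "Regex"));   // true: letters only
        System.out.println(matchesWhole("[a-zA-Z]+", "Regex1"));  // false: contains a digit
        System.out.println(matchesWhole("[^1-5]", "7"));          // true: negated range
    }
}
```

The + quantifier in [a-zA-Z]+ is used here only so that a whole word can be tested; the class itself still matches one character at a time.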
Unions
You can also use unions to create a single character class from two or more separate character classes. To create a union, simply nest one class inside another: [0-4[6-8]]. This union creates a single character class that matches the digits 0, 1, 2, 3, 4, 6, 7, and 8.
Intersections
To create a single character class that matches only the characters common to all of its nested classes, use &&, for example: [0-9&&[345]]. This expression matches only characters that belong to both classes, i.e. the digits 3, 4, and 5.
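To compare a union with an intersection, the sketch below tests each digit from 0 to 9 against a character class and collects the ones it accepts. The class name UnionIntersectionDemo and the helper acceptedDigits are hypothetical names used for this example.

```java
import java.util.regex.Pattern;

class UnionIntersectionDemo {
    // Hypothetical helper: returns the digits 0-9 accepted by the given character class.
    static String acceptedDigits(String charClass) {
        Pattern pattern = Pattern.compile(charClass);
        StringBuilder accepted = new StringBuilder();
        for (char c = '0'; c <= '9'; c++) {
            if (pattern.matcher(String.valueOf(c)).matches()) {
                accepted.append(c);
            }
        }
        return accepted.toString();
    }

    public static void main(String[] args) {
        // Union: digits in 0-4 or in 6-8.
        System.out.println(acceptedDigits("[0-4[6-8]]"));    // prints 01234678
        // Intersection: digits that are in 0-9 and also in [345].
        System.out.println(acceptedDigits("[0-9&&[345]]"));  // prints 345
    }
}
```

The union skips 5 and 9 because neither nested range contains them, while the intersection keeps only the digits that both classes have in common.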