Methods for working with captured groups
The source code of the RegexDemo application includes a call to m.group(). The group() method is one of several methods of the Matcher class for working with captured groups:
- The method int groupCount() returns the number of capturing groups in the matcher's pattern. This count does not include the special group number 0, which corresponds to the pattern as a whole.
- The method String group() returns the characters of the previous match. If the pattern successfully matched an empty string, this method returns an empty string. If the matcher has not yet attempted a match or the previous match operation failed, an IllegalStateException is thrown.
- The method String group(int group) is similar to the previous method, except that it returns the characters of the previous match captured by the group whose number is given by the parameter group. Note that group(0) is equivalent to group(). If the pattern has no capturing group with the given number, the method throws an IndexOutOfBoundsException. If the matcher has not yet attempted a match or the previous match operation failed, an IllegalStateException is thrown.
- The method String group(String name) returns the characters of the previous match captured by the group called name. If the pattern has no capturing group with that name, an IllegalArgumentException is thrown. If the matcher has not yet attempted a match or the previous match operation failed, an IllegalStateException is thrown.
The following example demonstrates the groupCount() and group(int group) methods:
Pattern p = Pattern.compile("(.(.(.)))");
Matcher m = p.matcher("abc");
m.find();
System.out.println(m.groupCount());
for (int i = 0; i <= m.groupCount(); i++)
   System.out.println(i + ": " + m.group(i));
Execution results:
3
0: abc
1: abc
2: bc
3: c
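The group(String name) method and the IllegalStateException case described above can be sketched in the same way. This is a minimal illustration, not part of RegexDemo: the class name NamedGroupDemo, the helper findYear(), and the group names year and month are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo
{
   // Hypothetical helper: returns the text captured by the named group
   // "year" in the first match, or null if the pattern is not found.
   static String findYear(String input)
   {
      Pattern p = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})");
      Matcher m = p.matcher(input);
      return m.find() ? m.group("year") : null;
   }

   public static void main(String[] args)
   {
      System.out.println(findYear("released 2014-03"));   // prints 2014

      Matcher m = Pattern.compile("(?<year>\\d{4})").matcher("2014");
      try
      {
         m.group("year");   // no match has been attempted yet
      }
      catch (IllegalStateException ise)
      {
         System.out.println("group() was called before a successful match");
      }
   }
}
```

Named groups are written as (?<name>...) in the pattern. They are still numbered, so in the first pattern group(1) would return the same text as group("year").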

Methods for determining match positions
The Matcher class provides several methods that return the starting and ending positions of a match:
- The method int start() returns the starting position of the previous match. If the matcher has not yet attempted a match or the previous match operation failed, an IllegalStateException is thrown.
- The method int start(int group) is similar to the previous method, but returns the starting position of the previous match for the group whose number is given by the parameter group. If the pattern has no capturing group with the given number, the method throws an IndexOutOfBoundsException. If the matcher has not yet attempted a match or the previous match operation failed, an IllegalStateException is thrown.
- The method int start(String name) is similar to the previous method, but returns the starting position of the previous match for the group called name. If the pattern has no capturing group called name, an IllegalArgumentException is thrown. If the matcher has not yet attempted a match or the previous match operation failed, an IllegalStateException is thrown.
- The method int end() returns the position of the last character of the previous match plus 1. If the matcher has not yet attempted a match or the previous match operation failed, an IllegalStateException is thrown.
- The method int end(int group) is similar to the previous method, but returns the ending position of the previous match for the group whose number is given by the parameter group. If the pattern has no capturing group with the given number, the method throws an IndexOutOfBoundsException. If the matcher has not yet attempted a match or the previous match operation failed, an IllegalStateException is thrown.
- The method int end(String name) is similar to the previous method, but returns the ending position of the previous match for the group called name. If the pattern has no capturing group called name, an IllegalArgumentException is thrown. If the matcher has not yet attempted a match or the previous match operation failed, an IllegalStateException is thrown.
The following example reports the starting and ending positions of group 2 for each match:
Pattern p = Pattern.compile("(.(.(.)))");
Matcher m = p.matcher("abcabcabc");
while (m.find())
{
   System.out.println("Found " + m.group(2));
   System.out.println(" starts at position " + m.start(2) +
                      " and ends at position " + (m.end(2) - 1));
   System.out.println();
}
The output of this example is the following:
Found bc
 starts at position 1 and ends at position 2

Found bc
 starts at position 4 and ends at position 5

Found bc
 starts at position 7 and ends at position 8
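Because end() returns the position after the last matched character, the pair of values from start() and end() can be passed directly to String.substring(), which also takes an exclusive end index. The sketch below illustrates this with the name-based variants. The class name PositionDemo, the helper span(), and the group name word are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PositionDemo
{
   // Hypothetical helper: returns "start-end" for the named group "word"
   // in the first match, where end is exclusive, or null if nothing matches.
   static String span(String input)
   {
      Matcher m = Pattern.compile("(?<word>[a-z]+)").matcher(input);
      if (!m.find())
         return null;
      return m.start("word") + "-" + m.end("word");
   }

   public static void main(String[] args)
   {
      String input = "12 abc 34";
      Matcher m = Pattern.compile("(?<word>[a-z]+)").matcher(input);
      if (m.find())
      {
         // substring() cuts out exactly the text that the group captured
         String cut = input.substring(m.start("word"), m.end("word"));
         System.out.println(cut.equals(m.group("word")));   // prints true
         System.out.println(span(input));                   // prints 3-6
      }
   }
}
```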
Methods of the PatternSyntaxException class
An instance of the PatternSyntaxException class describes a syntax error in a regular expression. This exception is thrown by the compile() and matches() methods of the Pattern class, and is created through the following constructor: PatternSyntaxException(String desc, String regex, int index)
This constructor stores the specified description (desc), the regular expression (regex), and the position (index) at which the syntax error occurred. If the position of the syntax error is unknown, index is set to -1.
You will most likely never need to create PatternSyntaxException instances yourself. However, you will need to extract the above values when building a formatted error message. To do this, you can use the following methods:
- The method String getDescription() returns a description of the syntax error.
- The method int getIndex() returns either the position at which the error occurred, or -1 if the position is unknown.
- The method String getPattern() returns the invalid regular expression.
In addition, the inherited method String getMessage() returns a multiline string with the values returned by the previous methods, along with a visual indication of where the syntax error occurred in the pattern.
What does a syntax error look like? Here is an example: java RegexDemo (?itree Treehouse
In this case, we forgot to specify the closing parenthesis metacharacter ()) in the embedded flag expression. This error produces the following output:
regex = (?itree
input = Treehouse
Bad regex: Unknown inline modifier near index 3
(?itree
   ^
Description: Unknown inline modifier
Index: 3
Incorrect pattern: (?itree
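The accessor methods can also be used without RegexDemo. The following sketch compiles a regular expression and, on failure, builds a one-line report from getIndex() and getPattern(). The class name SyntaxErrorDemo and the helper check() are assumptions made for the example, and the exact description text may vary between Java versions.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SyntaxErrorDemo
{
   // Hypothetical helper: returns "OK" if the regex compiles, otherwise
   // a short report built from the exception's accessor methods.
   static String check(String regex)
   {
      try
      {
         Pattern.compile(regex);
         return "OK";
      }
      catch (PatternSyntaxException pse)
      {
         return "Error at index " + pse.getIndex() + " in " + pse.getPattern();
      }
   }

   public static void main(String[] args)
   {
      System.out.println(check("(?i)tree"));   // valid embedded flag expression
      System.out.println(check("(?itree"));    // closing parenthesis is missing
   }
}
```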
Build Useful Regular Expression Applications Using the Regex API
Regular expressions enable you to create powerful text processing applications. In this section, we'll show you two handy applications that will hopefully encourage you to further explore the Regex API classes and methods. The second application introduces Lexan: a reusable code library for performing lexical analysis.
Regular expressions and documentation
Documentation is one of the mandatory tasks when developing professional software. Fortunately, regular expressions can help you with many aspects of creating documentation. The code in Listing 1 extracts lines containing single-line and multiline C-style comments from a source file and writes them to another file. For the code to work, each comment must fit entirely on one line.
Listing 1. Extracting comments
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;
public class ExtCmnt
{
   public static void main(String[] args)
   {
      if (args.length != 2)
      {
         System.err.println("Usage: java ExtCmnt infile outfile");
         return;
      }
      Pattern p;
      try
      {
         // The following pattern defines multiline comments that fit
         // on a single line (for example, /* one line */)
         // and single-line comments (for example, // some line).
         // A comment may appear anywhere on the line.
         p = Pattern.compile(".*/\\*.*\\*/|.*//.*$");
      }
      catch (PatternSyntaxException pse)
      {
         System.err.printf("Regex syntax error: %s%n", pse.getMessage());
         System.err.printf("Error description: %s%n", pse.getDescription());
         System.err.printf("Error index: %s%n", pse.getIndex());
         System.err.printf("Erroneous pattern: %s%n", pse.getPattern());
         return;
      }
      try (FileReader fr = new FileReader(args[0]);
           BufferedReader br = new BufferedReader(fr);
           FileWriter fw = new FileWriter(args[1]);
           BufferedWriter bw = new BufferedWriter(fw))
      {
         Matcher m = p.matcher("");
         String line;
         while ((line = br.readLine()) != null)
         {
            m.reset(line);
            if (m.matches()) /* The entire line must match */
            {
               bw.write(line);
               bw.newLine();
            }
         }
      }
      catch (IOException ioe)
      {
         System.err.println(ioe.getMessage());
         return;
      }
   }
}
The main() method in Listing 1 first checks the command-line syntax and then compiles a regular expression for detecting single-line and multiline comments into a Pattern object. If no PatternSyntaxException is thrown, main() opens the source file, creates the target file, obtains a matcher for matching each line read against the pattern, and then reads the source file line by line. Each line is matched against the comment pattern. On success, main() writes the line (followed by a newline) to the target file (we'll cover file I/O logic in a future Java 101 tutorial).
Compile Listing 1 as follows: javac ExtCmnt.java
Run the application with the file ExtCmnt.java as input: java ExtCmnt ExtCmnt.java out
You should get the following results in the file out:
         // The following pattern defines multiline comments that fit
         // on a single line (for example, /* one line */)
         // and single-line comments (for example, // some line).
         // A comment may appear anywhere on the line.
         p = Pattern.compile(".*/\\*.*\\*/|.*//.*$");
            if (m.matches()) /* The entire line must match */
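To see which kinds of lines the pattern from Listing 1 accepts without running it against a whole file, it can be applied to a few sample strings. This is only a sketch: the class name CommentLineDemo, the helper hasComment(), and the sample lines are assumptions made for the example.

```java
import java.util.regex.Pattern;

class CommentLineDemo
{
   // The same regular expression that Listing 1 compiles
   private static final Pattern COMMENT =
      Pattern.compile(".*/\\*.*\\*/|.*//.*$");

   // Hypothetical helper: true if the entire line matches the pattern
   static boolean hasComment(String line)
   {
      return COMMENT.matcher(line).matches();
   }

   public static void main(String[] args)
   {
      String[] lines =
      {
         "int x = 1; // counter",       // single-line comment: accepted
         "int y = 2; /* one line */",   // complete multiline comment: accepted
         "/* starts here",              // comment not closed on this line: rejected
         "int z = 3;"                   // no comment: rejected
      };
      for (String line : lines)
         System.out.println(hasComment(line) + "  " + line);
   }
}
```

Because the pattern looks only for the character sequences // or /* ... */, it also accepts lines where they occur inside a string literal, which is why the Pattern.compile() line appears in the output above.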
In the pattern string .*/\\*.*\\*/|.*//.*$, the pipe metacharacter | acts as a logical OR operator: the matcher first uses the left operand of this regular expression construct to look for a match in the input text. If there is no match, the matcher uses the right operand for another attempt (the parenthesis metacharacters in a capturing group form another logical operator).
Regular Expressions in Java, Part 5