
Split method in Java: divide a string into parts

Published in the Random EN group
Let's talk about the String split method: what it does and why it is needed. It's easy to guess that it splits a string, but how does it work in practice? Let's take a closer look at the method, discuss some non-obvious details, and find out how many split methods the String class actually has. Let's go!

Definition and signature for Java String.split

The split method in Java splits a string into substrings using a delimiter that is specified using a regular expression. Let's give the method signature and begin our dive:

String[] split(String regex)
Two things are clear from the signature:
  1. The method returns an array of strings.
  2. The method takes a regex string as a parameter.
Let's look at each thing separately in terms of the definition given above.
  1. The method returns an array of strings.

    The definition contains the following words: “The split method in Java splits a string into substrings.” These substrings are collected by the method into an array and represent its return value.

  2. The method takes a regex string as a parameter.

    Again, recall the definition: “splits a string into substrings using a delimiter that is specified using a regular expression.” The regex parameter is a regular expression pattern that is applied to the source string and matches the delimiter (a single character or a combination of characters).
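
Because the delimiter is a regular expression, it does not have to be a fixed character. Here is a minimal sketch of that idea; the class name RegexDelimiterDemo and the helper splitOnWhitespace are made up for illustration and are not part of the String API:

```java
import java.util.Arrays;

public class RegexDelimiterDemo {
    // Hypothetical helper: the regex \s+ matches one or more whitespace characters.
    static String[] splitOnWhitespace(String text) {
        return text.split("\\s+");
    }

    public static void main(String[] args) {
        // Several spaces and a tab between the words still give three elements.
        System.out.println(Arrays.toString(splitOnWhitespace("I   love\tJava")));
        // prints [I, love, Java]

        // A character class lets either ',' or ';' act as the delimiter.
        System.out.println(Arrays.toString("a,b;c".split("[,;]")));
        // prints [a, b, c]
    }
}
```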


Split in practice

Now let's get down to business. Let's imagine that we have a string with words. For example, like this:
I love Java
We need to break the string into words. In this string the words are separated by spaces, so a space is an ideal delimiter here. This is what the code for solving this problem looks like:

public class Main {
    public static void main(String[] args) {
        String str = "I love Java";
        String[] words = str.split(" ");
        for (String word : words) {
            System.out.println(word);
        }
    }
}
The output of the main method will be the following lines:
I
love
Java
Let's look at a few more examples of how the split method works:
String | Delimiter | Result of the method
"I love Java" | " " (space character) | { "I", "love", "Java" }
"192.168.0.1:8080" | ":" | { "192.168.0.1", "8080" }
"Red, orange, yellow" | "," | { "Red", " orange", " yellow" }
"Red, orange, yellow" | ", " | { "Red", "orange", "yellow" }
Notice the difference between the last two rows in the table above. In the second-to-last row, the delimiter is a comma alone, so the string is split in such a way that some words keep a leading space. In the last row, we used a comma followed by a space as the delimiter, so the resulting array contains no strings with leading spaces. This is a small detail, but it shows how important it is to choose the delimiter carefully.
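
When we can't be sure whether a space follows each comma, one option is to let the regular expression absorb optional whitespace. The sketch below assumes that is acceptable for the input; CommaSplitDemo and splitList are hypothetical names used only for this example:

```java
import java.util.Arrays;

public class CommaSplitDemo {
    // Hypothetical helper: a comma followed by any amount of whitespace.
    static String[] splitList(String list) {
        return list.split(",\\s*");
    }

    public static void main(String[] args) {
        // A bare comma as the delimiter keeps the leading spaces.
        System.out.println(Arrays.toString("Red, orange, yellow".split(",")));
        // prints [Red,  orange,  yellow]

        // The tolerant pattern handles zero, one or several spaces.
        System.out.println(Arrays.toString(splitList("Red,orange,  yellow")));
        // prints [Red, orange, yellow]
    }
}
```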

Leading delimiter

There is one more important nuance. If the source string begins with a delimiter, the first element of the resulting array will be an empty string. For example:
Source string: " I love Java"
Delimiter: " "
Resulting array: { "", "I", "love", "Java" }
But if the source string ends with a delimiter rather than beginning with one, the result is different, because trailing empty strings are dropped:
Source string: "I love Java "
Delimiter: " "
Resulting array: { "I", "love", "Java" }
Let's look at code that calls the split method with the delimiter at the beginning and/or end of the source string:

import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        print("I love Java".split(" "));
        print(" I love Java".split(" "));
        print("I love Java ".split(" "));
        print(" I love Java ".split(" "));
    }

    static void print(String[] arr) {
        System.out.println(Arrays.toString(arr));
    }
}
The output of the main method will be:
[I, love, Java]
[, I, love, Java]
[I, love, Java]
[, I, love, Java]
Note again that when the first character in the source string is a delimiter character, the resulting array will have the empty string as its first element.
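
If the leading empty string is not wanted, one possible approach is to trim the string before splitting. This is only a sketch under the assumption that surrounding whitespace carries no meaning; LeadingDelimiterDemo and words are hypothetical names:

```java
import java.util.Arrays;

public class LeadingDelimiterDemo {
    // Hypothetical helper: trims both ends, then splits on runs of whitespace.
    static String[] words(String text) {
        return text.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // Two leading spaces produce two leading empty strings.
        System.out.println(Arrays.toString("  I love Java".split(" ")));
        // prints [, , I, love, Java]

        // After trimming, no empty elements remain.
        System.out.println(Arrays.toString(words("  I love Java ")));
        // prints [I, love, Java]
    }
}
```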

The overloaded version

The String class has another split method with this signature:

String[] split(String regex, int limit)
This method has an additional limit parameter: it determines how many times the regex pattern is applied to the source string, and therefore how long the resulting array can be. The cases are explained below:

limit > 0

The pattern is applied at most limit - 1 times, so the length of the array will not exceed limit. The last element of the array will be the rest of the string after the last delimiter that was matched. Example:

import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        print("I love Java".split(" ", 1));
        print("I love Java".split(" ", 2));
        /*
         Output: 
         [I love Java]
         [I, love Java]
        */
    }

    static void print(String[] arr) {
        System.out.println(Arrays.toString(arr));
    }
}
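
A positive limit is handy when only the first delimiter should count. The sketch below assumes a made-up "key=value" format in which the value itself may contain "="; KeyValueDemo and keyValue are hypothetical names:

```java
import java.util.Arrays;

public class KeyValueDemo {
    // Hypothetical helper: splits on the first '=' only, so the value stays intact.
    static String[] keyValue(String line) {
        return line.split("=", 2);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(keyValue("timeout=30=seconds")));
        // prints [timeout, 30=seconds]

        // The limit is an upper bound: fewer delimiters simply give a shorter array.
        System.out.println("I love Java".split(" ", 5).length);
        // prints 3
    }
}
```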

limit < 0

The delimiter pattern is applied to the string as many times as possible, and the resulting array can be of any length. Trailing empty strings are kept. Example:

import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        // Notice the space at the end of the line
        print("I love Java ".split(" ", -1));
        print("I love Java ".split(" ", -2));
        print("I love Java ".split(" ", -12));
        /*
         Output:
        [I, love, Java, ]
        [I, love, Java, ]
        [I, love, Java, ]
        
        Note that the last element of the array is
        an empty string, resulting from the space
        at the end of the original string. 
        */
    }

    static void print(String[] arr) {
        System.out.println(Arrays.toString(arr));
    }
}
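
Keeping the trailing empty strings matters when every position in the string means something. As a sketch, assume a comma-separated record in which empty fields are allowed; NegativeLimitDemo and fields are hypothetical names:

```java
import java.util.Arrays;

public class NegativeLimitDemo {
    // Hypothetical helper: a negative limit keeps empty fields at the end.
    static String[] fields(String record) {
        return record.split(",", -1);
    }

    public static void main(String[] args) {
        // Without a limit the two empty trailing fields disappear.
        System.out.println("a,b,,".split(",").length);
        // prints 2

        // With a negative limit all four fields are preserved.
        System.out.println(Arrays.toString(fields("a,b,,")));
        // prints [a, b, , ]
    }
}
```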

limit = 0

As with limit < 0, the delimiter pattern is applied to the string as many times as possible, and the resulting array can be of any length. However, trailing empty strings are discarded from the resulting array. Example:

import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        // Notice the space at the end of the line
        print("I love Java ".split(" ", 0));
        print("I love Java ".split(" ", 0));
        print("I love Java ".split(" ", 0));
        /*
         Output:
        [I, love, Java]
        [I, love, Java]
        [I, love, Java]
        Note the absence of empty strings at the end of the arrays
        */
    }

    static void print(String[] arr) {
        System.out.println(Arrays.toString(arr));
    }
}
If we look at the implementation of the one-argument split method (as found, for example, in the Java 8 sources), we see that it calls its overloaded sibling with a second argument of zero:

    public String[] split(String regex) {
        return split(regex, 0);
    }
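
We can check this equivalence from the outside without reading the JDK sources. The following is a small sketch; ZeroLimitDemo and sameAsZeroLimit are hypothetical names used only here:

```java
import java.util.Arrays;

public class ZeroLimitDemo {
    // Hypothetical helper: compares split(regex) with split(regex, 0).
    static boolean sameAsZeroLimit(String text, String regex) {
        return Arrays.equals(text.split(regex), text.split(regex, 0));
    }

    public static void main(String[] args) {
        System.out.println(sameAsZeroLimit("I love Java ", " "));
        // prints true
        System.out.println(sameAsZeroLimit(" I love Java", " "));
        // prints true

        // A negative limit behaves differently when the string ends with a delimiter.
        System.out.println(Arrays.equals("I love Java ".split(" "),
                                         "I love Java ".split(" ", -1)));
        // prints false
    }
}
```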

Various examples

In practice, we sometimes receive a string composed according to certain rules. Such a string can “come” into our program from anywhere:
  • from a third-party service;
  • from a request to our server;
  • from the configuration file;
  • etc.
Usually in such a situation the programmer knows the “rules of the game”. Let's say the programmer knows that user information is stored according to this pattern:
user_id|user_login|user_email
For example, let's take specific values:
135|bender|bender@gmail.com
Now the programmer is faced with a task: to write a method that sends an email to the user. The available user information is recorded in the format above. The subtask we will analyze is extracting the email address from that information.
This is one example where the split method can be useful. Looking at the template, we see that to extract the user's email address we only need to split the string with the split method. The email address will then be in the last element of the resulting array. Here is an example of such a method, which takes a string containing information about the user and returns the user's email. To keep things simple, let's assume that the string always matches the format we need:

public class Main {
    public static void main(String[] args) {
        String userInfo = "135|bender|bender@gmail.com";
        System.out.println(getUserEmail(userInfo));
        // Output: bender@gmail.com
    }

    static String getUserEmail(String userInfo) {
        String[] data = userInfo.split("\\|");
        return data[2]; // or data[data.length - 1]
    }
}
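
The delimiter in getUserEmail is written in escaped form. An alternative that avoids escaping by hand is Pattern.quote from java.util.regex, which turns any string into a literal pattern. The sketch below is one possible way to use it; LiteralDelimiterDemo and splitLiteral are hypothetical names:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class LiteralDelimiterDemo {
    // Hypothetical helper: treats the delimiter as plain text, not as a regex.
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String userInfo = "135|bender|bender@gmail.com";

        // An unescaped "|" is a regex operator, so the string is not
        // split into the three fields we expect.
        System.out.println(userInfo.split("|").length > 3);
        // prints true

        System.out.println(Arrays.toString(splitLiteral(userInfo, "|")));
        // prints [135, bender, bender@gmail.com]
    }
}
```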
Note the separator: "\\|". In regular expressions, "|" is a special character (the alternation operator). To match it literally, we escape it with a backslash, and because the backslash itself must be escaped inside a Java string literal, we write two backslashes.
Let's look at another example. Let's say we have information about an order, which is written in approximately this format:
item_number_1,item_name_1,item_price_1;item_number_2,item_name_2,item_price_2;...;item_number_n,item_name_n,item_price_n
Or, with specific values:
1, cucumbers, 20.05; 2, tomatoes, 123.45; 3, hares, 0.50
We are faced with the task of calculating the total cost of the order. Here we will have to use the split method several times. The first step is to split the string on the ";" character into its component parts. Each part then holds information about an individual product. Next, within each product, we split the information on the "," character, take the element at the index where the price is stored, convert it to a number, and add it to the total cost of the order. Let's write a method that does all this:

public class Main {
    public static void main(String[] args) {
        String orderInfo = "1, cucumbers, 20.05; 2, tomatoes, 123.45; 3, hares, 0.50";
        System.out.println(getTotalOrderAmount(orderInfo));
        // Output: 144.0
    }

    static double getTotalOrderAmount(String orderInfo) {
        double totalAmount = 0d;
        final String[] items = orderInfo.split(";");

        for (String item : items) {
            final String[] itemInfo = item.split(",");
            totalAmount += Double.parseDouble(itemInfo[2]);
        }

        return totalAmount;
    }
}
Try to figure out for yourself how this method works. Based on these examples, we can say that the split method is used when we have some information in string form, from which we need to extract some more specific information.
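
As a variation on the order example, here is a sketch that lets the delimiter patterns absorb the spaces and sums the prices with BigDecimal instead of double. This is only one possible approach, not a replacement for the method above; OrderTotalDemo and total are hypothetical names, and the input is still assumed to match the format exactly:

```java
import java.math.BigDecimal;

public class OrderTotalDemo {
    // Hypothetical helper: sums the third field of every item in the order.
    static BigDecimal total(String orderInfo) {
        BigDecimal sum = BigDecimal.ZERO;
        // Optional whitespace around ';' and ',' becomes part of the delimiter.
        for (String item : orderInfo.trim().split("\\s*;\\s*")) {
            String[] itemInfo = item.split("\\s*,\\s*");
            sum = sum.add(new BigDecimal(itemInfo[2]));
        }
        return sum;
    }

    public static void main(String[] args) {
        String orderInfo = "1, cucumbers, 20.05; 2, tomatoes, 123.45; 3, hares, 0.50";
        System.out.println(total(orderInfo));
        // prints 144.00
    }
}
```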

Results

We looked at the split method of the String class. It is needed to split a string into its component parts using a delimiter. The method accepts a regular expression that matches the delimiter character(s) and returns an array of strings (the components of the source string). We looked at the various subtleties of this method:
  • leading delimiter character;
  • overloaded version with two arguments.
We also tried to simulate some “real life” situations in which we used the split method to solve fictitious but quite realistic problems.