
split method in java: split string into parts

Published in the Random EN group
Let's talk about the String split method: what it does and why it's needed. It's easy to guess that it splits a string, but how does it work in practice? Let's take a closer look at how the method works, discuss some non-obvious details, and find out how many split methods the String class really has. Let's go!

Definition and signature for Java String.split

The split method in Java splits a string into substrings using a delimiter that is defined by a regular expression. Here is the method signature:
String[] split(String regex)
Two things are clear from the signature:
  1. The method returns an array of strings.
  2. The method takes a regex string as a parameter.
Let's analyze each point separately in the context of the definition given above.
  1. The method returns an array of strings.

    The definition contains these words: "The split method in Java splits a string into substrings." The method collects these substrings into an array, which is its return value.

  2. The method takes a regex string as a parameter.

    Again, recall the definition: "splits a string into substrings using a delimiter that is defined using a regular expression." The regex parameter is a regular expression pattern that is applied to the source string and matches the delimiter character (or combination of characters) in it.
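Because the parameter is a regular expression rather than a plain string, a single call can treat a run of characters, or any one of several characters, as the delimiter. Here is a minimal sketch of that idea; the class name, the method names and the sample strings are invented for illustration:

```java
import java.util.Arrays;

public class RegexDelimiterDemo {
    // Splits on one or more whitespace characters (spaces, tabs and so on).
    static String[] splitOnWhitespace(String text) {
        return text.split("\\s+");
    }

    // Splits on either a comma or a semicolon.
    static String[] splitOnCommaOrSemicolon(String text) {
        return text.split("[,;]");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnWhitespace("I   love\tJava")));
        // [I, love, Java]
        System.out.println(Arrays.toString(splitOnCommaOrSemicolon("a,b;c")));
        // [a, b, c]
    }
}
```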


Split in practice

Now closer to the point. Imagine that we have a string with words. For example, like this:
I love Java
We need to split the string into words. The words in this string are separated from each other by spaces, so a space is the ideal candidate for a delimiter in this case. Here is what the code for this task looks like:
public class Main {
    public static void main(String[] args) {
        String str = "I love Java";
        String[] words = str.split(" ");
        for (String word : words) {
            System.out.println(word);
        }
    }
}
The output of the main method will be the following lines:
I
love
Java
Let's look at a few more examples of how the split method would work :
Source string            Delimiter                The result of the method
"I love Java"            " " (space character)    { "I", "love", "Java" }
"192.168.0.1:8080"       ":"                      { "192.168.0.1", "8080" }
"Red, Orange, Yellow"    ","                      { "Red", " Orange", " Yellow" }
"Red, Orange, Yellow"    ", "                     { "Red", "Orange", "Yellow" }
Notice the difference between the last two rows in the table above. In the second-to-last row the delimiter is a comma, so the string is split in such a way that some words keep a leading space. In the last row the delimiter is a comma followed by a space, so the resulting array contains no strings with leading spaces. This small detail shows how important it is to choose the delimiter carefully.
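When the spacing around commas is not guaranteed, one option is a delimiter pattern that absorbs the optional whitespace itself. The sketch below reproduces the last two rows of the table and then applies such a pattern; the class name, the splitList helper and the irregular sample string are assumptions made for this example:

```java
import java.util.Arrays;

public class CommaSeparatorDemo {
    // Splits on a comma surrounded by any amount of whitespace.
    // trim() removes whitespace at the very start and end of the input,
    // which the delimiter pattern alone would not touch.
    static String[] splitList(String text) {
        return text.trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String colors = "Red, Orange, Yellow";
        System.out.println(Arrays.toString(colors.split(",")));
        // [Red,  Orange,  Yellow]   (note the leading spaces)
        System.out.println(Arrays.toString(colors.split(", ")));
        // [Red, Orange, Yellow]
        System.out.println(Arrays.toString(splitList("Red,Orange ,  Yellow")));
        // [Red, Orange, Yellow]
    }
}
```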

Leading delimiter

There is another important nuance. If the source string starts with a delimiter, the first element of the resulting array will be the empty string. For example:
Source string: " I love Java"
Delimiter: " "
Resulting array: { "", "I", "love", "Java" }
But if the source string ends with a delimiter rather than beginning with one, the result is different, because trailing empty strings are dropped from the result:
Source string: "I love Java "
Delimiter: " "
Resulting array: { "I", "love", "Java" }
The following code shows how split behaves with a delimiter character at the end and/or beginning of the original string:
import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        print("I love Java".split(" "));
        print(" I love Java".split(" "));
        print("I love Java ".split(" "));
        print(" I love Java ".split(" "));
    }

    static void print(String[] arr) {
        System.out.println(Arrays.toString(arr));
    }
}
The output of the main method would be:
[I, love, Java]
[, I, love, Java]
[I, love, Java]
[, I, love, Java]
Note again that when the first character in the source string is the delimiter character, the result is that the first element in the array is the empty string.
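If the empty element is not wanted, one possible approach is to filter empty strings out of the result. This is only a sketch: the class and the splitSkippingEmpty helper are hypothetical names, and it also discards empty strings that appear in the middle of the array (for example, those produced by two delimiters in a row).

```java
import java.util.Arrays;

public class NonEmptyPartsDemo {
    // Splits the text and keeps only the non-empty parts.
    static String[] splitSkippingEmpty(String text, String regex) {
        return Arrays.stream(text.split(regex))
                .filter(part -> !part.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitSkippingEmpty(" I love Java ", " ")));
        // [I, love, Java]
        System.out.println(Arrays.toString(splitSkippingEmpty("I  love Java", " ")));
        // [I, love, Java]
    }
}
```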

Overloaded version

The String class has another split method with the following signature:
String[] split(String regex, int limit)
This method has an additional limit parameter: it controls how many times the regex pattern is applied to the source string and, as a result, the maximum length of the resulting array. The cases are explained below:

limit > 0

The pattern is applied at most limit - 1 times, so the length of the array will not exceed the limit value. The last element of the array will contain the entire remainder of the string after the last delimiter that was matched. Example:
import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        print("I love Java".split(" ", 1));
        print("I love Java".split(" ", 2));
        /*
         Output:
         [I love Java]
         [I, love Java]
        */
    }

    static void print(String[] arr) {
        System.out.println(Arrays.toString(arr));
    }
}
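A positive limit is handy when only the first delimiter should count. The sketch below assumes a hypothetical "key=value" entry format in which the value itself may contain the "=" character; the class name, the parseEntry method and the sample entry are invented for illustration:

```java
import java.util.Arrays;

public class KeyValueDemo {
    // Splits at the first '=' only, so any further '=' stays in the value.
    static String[] parseEntry(String entry) {
        return entry.split("=", 2);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseEntry("formula=a=b+c")));
        // [formula, a=b+c]
    }
}
```

If the entry contains no "=" at all, the returned array has a single element, so real code would need to check the array length before reading the value.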

limit < 0

The delimiter pattern is applied to the string as many times as possible. The resulting array can be of any length, and empty strings at the end of the array are kept. Example:
import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        // Notice the space at the end of the line
        print("I love Java ".split(" ", -1));
        print("I love Java ".split(" ", -2));
        print("I love Java ".split(" ", -12));
        /*
         Output:
        [I, love, Java, ]
        [I, love, Java, ]
        [I, love, Java, ]

        Note that the last element of the array is
        an empty string, resulting from the space
        at the end of the original string.
        */
    }

    static void print(String[] arr) {
        System.out.println(Arrays.toString(arr));
    }
}
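A negative limit matters when the number of fields is significant, for example when the last columns of a record may be empty. The following sketch assumes a made-up comma-separated row with two empty trailing fields; the class and the countFields helper are hypothetical:

```java
public class TrailingFieldsDemo {
    // Counts the fields that split produces for the given limit.
    static int countFields(String row, int limit) {
        return row.split(",", limit).length;
    }

    public static void main(String[] args) {
        String row = "1,bender,,";
        // The one-argument split drops the two empty trailing fields.
        System.out.println(row.split(",").length);  // 2
        // A negative limit keeps them.
        System.out.println(countFields(row, -1));   // 4
    }
}
```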

limit = 0

As with limit < 0, the delimiter pattern is applied to the string as many times as possible, and the resulting array can be of any length. The difference is that empty strings at the end of the array are discarded. Example:
import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        // Notice the space at the end of the line
        print("I love Java ".split(" ", 0));
        print("I love Java ".split(" ", 0));
        print("I love Java ".split(" ", 0));
        /*
         Output:
        [I, love, Java]
        [I, love, Java]
        [I, love, Java]
        Note the absence of empty strings at the end of the arrays
        */
    }

    static void print(String[] arr) {
        System.out.println(Arrays.toString(arr));
    }
}
The split method with one argument behaves as if it called its overloaded counterpart with a limit of zero. In a number of JDK versions (Java 8, for example) that is literally how it is implemented:
public String[] split(String regex) {
    return split(regex, 0);
}
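To see the three cases side by side, here is a small sketch that applies different limits to the same string. The sample string, the class name and the partCount helper are assumptions made for this comparison:

```java
import java.util.Arrays;

public class LimitComparisonDemo {
    // Returns the number of parts produced for the given limit.
    static int partCount(String text, int limit) {
        return text.split(",", limit).length;
    }

    public static void main(String[] args) {
        String text = "a,b,,c,,";
        System.out.println(partCount(text, 2));   // 2: "a" and "b,,c,,"
        System.out.println(partCount(text, 0));   // 4: trailing empty strings removed
        System.out.println(partCount(text, -1));  // 6: trailing empty strings kept
        // The one-argument version gives the same result as limit = 0.
        System.out.println(Arrays.equals(text.split(","), text.split(",", 0)));  // true
    }
}
```

Note that the empty string between "b" and "c" survives in every case: only empty strings at the end of the array are affected by the limit.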

Various examples

In practice, we sometimes have a string that is composed according to certain rules. This string can "come" into our program from anywhere:
  • from a third-party service;
  • from a request to our server;
  • from the configuration file;
  • etc.
Usually in such a situation, the programmer knows the "rules of the game". Let's say the programmer knows that information about a user is stored in the following format:
user_id|user_login|user_email
For example, let's take specific values:
135|bender|bender@gmail.com
Now the programmer faces a task: write a method that sends an email to the user, given the user information in the format above. The subtask we will look at here is extracting the email address from that information.
This is one case where the split method is useful. Looking at the format, we can see that to extract the email address we only need to split the string on the "|" character; the email address will then be the last element of the resulting array. Here is an example of a method that takes a string with user information and returns the user's email. For simplicity, let's assume that the string always matches the expected format:
public class Main {
    public static void main(String[] args) {
        String userInfo = "135|bender|bender@gmail.com";
        System.out.println(getUserEmail(userInfo));
        // Output: bender@gmail.com
    }

    static String getUserEmail(String userInfo) {
        String[] data = userInfo.split("\\|");
        return data[2]; // or data[data.length - 1]
    }
}
Pay attention to the delimiter: "\\|". In regular expressions, "|" is a special character (it denotes alternation), so to match it literally we have to escape it with a backslash. And because the backslash is itself an escape character in Java string literals, it is written twice, which gives "\\|".
Let's consider another example. Let's say we have information about an order, written in the following format:
item_number_1,item_name_1,item_price_1;item_number_2,item_name_2,item_price_2;...;item_number_n,item_name_n,item_price_n
Or, with specific values:
1, cucumbers, 20.05; 2, tomatoes, 123.45; 3, hares, 0.50
Our task is to calculate the total cost of the order. Here we have to apply the split method several times. First, we split the string on the ";" character to get one part per product. Then we split each part on the "," character, take the element at the index where the price is stored, convert it to a number, and add it to the total. (In the sample values the price keeps a leading space after the split; Double.parseDouble ignores leading and trailing whitespace.) Let's write a method that does all this:
public class Main {
    public static void main(String[] args) {
        String orderInfo = "1, cucumbers, 20.05; 2, tomatoes, 123.45; 3, hares, 0.50";
        System.out.println(getTotalOrderAmount(orderInfo));
        // Output: 144.0
    }

    static double getTotalOrderAmount(String orderInfo) {
        double totalAmount = 0d;
        final String[] items = orderInfo.split(";");

        for (String item : items) {
            final String[] itemInfo = item.split(",");
            totalAmount += Double.parseDouble(itemInfo[2]);
        }

        return totalAmount;
    }
}
Try to work out for yourself how this method works. Based on these examples, we can say that the split method is used when we have some information in string form and need to extract something more specific from it.
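Both examples depend on writing the delimiter correctly as a regular expression. When the delimiter is meant to be taken literally, an alternative to escaping it by hand is Pattern.quote from java.util.regex, which turns any string into a pattern that matches it literally. A minimal sketch follows; the class name and the splitLiteral helper are hypothetical:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class LiteralDelimiterDemo {
    // Splits on the delimiter taken literally, whatever characters it contains.
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLiteral("135|bender|bender@gmail.com", "|")));
        // [135, bender, bender@gmail.com]
        System.out.println(Arrays.toString(splitLiteral("192.168.0.1", ".")));
        // [192, 168, 0, 1]

        // Without quoting, "." matches every character, so every part is
        // empty and all of them are removed as trailing empty strings.
        System.out.println("192.168.0.1".split(".").length);  // 0
    }
}
```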

Results

We have looked at the split method of the String class. It splits a string into its component parts using a delimiter. The method accepts a regular expression that matches the delimiter character(s) and returns an array of strings (the parts of the original string). We have considered several subtleties of this method:
  • a leading delimiter character;
  • the overloaded version with two arguments.
We also simulated some “real life” situations in which the split method was used to solve fictitious but quite realistic problems.