IgorBrest
Level 33

substring(..) haunted me

Published in the Random EN group
It was the subject line that pushed me to dig into String.substring(..), and I arrived at results I did not expect, so I decided to share them with you, dear JavaRushers, and submit them to your judgment, so to speak. Here goes. It is often stated that a string created with the substring(..) method reuses the character array of the original string. Here, for instance, is an excerpt from the recently read article “Java Reference. Static Strings” by the respected articles:
There is a note about the substring method - the returned string uses the same byte array as the original one
And, of course, the JavaRush lectures. Here are quotes from Dec 22:
When we create a substring using the substring method, a new String object is created. But instead of storing a reference to an array with a new set of characters, this object stores a reference to the old character array and also stores two variables that determine which part of the original character array belongs to it. ... When a substring is created, the character array is not copied to the new String object. Instead, both objects store a reference to the same character array. But! The second object stores two more variables, which record which characters of this array belong to it and how many. ... Therefore, if you take a string 10,000 characters long and make 10,000 substrings of any length from it, then these “substrings” will take up very little memory, because the character array is not duplicated. Strings that should take up a ton of space will only take up a couple of bytes.
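To make the quoted claim concrete, here is a minimal sketch of the scheme the lecture describes. The class SharedString and its method sharesArrayWith are hypothetical names invented for this illustration; this is not the real java.lang.String, only a model of "one shared array plus two extra variables".

```java
/**
 * Hypothetical illustration only: NOT the real java.lang.String.
 * It models the scheme from the lecture, where a substring keeps a reference
 * to the parent's char array plus an offset and a count.
 */
final class SharedString {
    private final char[] value; // backing array, shared between parent and substrings
    private final int offset;   // index of the first character that belongs to this string
    private final int count;    // number of characters that belong to this string

    SharedString(char[] value) {
        this(value, 0, value.length);
    }

    private SharedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    SharedString substring(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException();
        }
        // Only the two ints change; the array reference is reused as is.
        return new SharedString(value, offset + beginIndex, endIndex - beginIndex);
    }

    boolean sharesArrayWith(SharedString other) {
        return this.value == other.value;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        SharedString big = new SharedString("substring haunted me".toCharArray());
        SharedString sub = big.substring(10, 17);
        System.out.println(sub);                      // prints "haunted"
        System.out.println(sub.sharesArrayWith(big)); // prints "true"
    }
}
```

If String really worked like this, each substring would cost one small object, no matter how long the parent string is.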
Everything is clearly written, even spelled out. But since I am trying to improve my English, I often turn to the official documentation, and somehow I could not find confirmation of this fact there... Putting that down to my own carelessness, I looked at the source code of substring() anyway (IDEA lets you do this with one click):

    public String substring(int beginIndex, int endIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        if (endIndex > value.length) {
            throw new StringIndexOutOfBoundsException(endIndex);
        }
        int subLen = endIndex - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return ((beginIndex == 0) && (endIndex == value.length)) ? this
                : new String(value, beginIndex, subLen);
    }

Intrigued, I went further:

    /**
     * Allocates a new {@code String} that contains characters from a subarray
     * of the character array argument. The {@code offset} argument is the
     * index of the first character of the subarray and the {@code count}
     * argument specifies the length of the subarray. The contents of the
     * subarray are copied; subsequent modification of the character array does
     * not affect the newly created string.
     */
    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }

Here Arrays.copyOfRange allocates a new char array and copies the requested range into it (the copying itself is done by the native System.arraycopy). Quite trivial code, and it seemed obvious to me that a new string with its own new set of chars is simply created. Or had I overlooked something? So, not fully trusting my conclusions, I decided to test this substring() somehow, relying on a phrase from the lecture:
Therefore, if you take a string 10,000 characters long and make 10,000 substrings of any length from it, then these “substrings” will take up very little memory...
only instead of 10_000 we will go straight to 100_000_000, why waste time on trifles. I quickly threw together the following code:

    import java.util.ArrayList;
    import java.util.List;

    public class Test {
        public static void main(String[] args) {
            System.out.println("Starting:");
            print();
            System.out.println("********************************");
            char[] big = new char[100_000_000]; // create a decent-sized array
            int j = 0; // and fill this array with any old junk
            for (int k = 0; k < big.length; k++) {
                // The loop body and the next four statements are missing from the
                // published listing; they are reconstructed here from the program output.
                big[k] = (char) ('a' + j);
                j = (j + 1) % 26;
            }
            String bigString = new String(big);
            System.out.println("created the big string bigString from the array big.");
            System.out.print("Now: ");
            print();
            // references to the strings are kept here so that the garbage collector
            // does not remove strings it considers unused
            List<String> list = new ArrayList<>();
            System.out.println("************************************");
            System.out.println("Now we will create substrings with substring(..) and watch "
                    + "what happens to memory");
            for (int i = 2; i < 10; i++) {
                // create a substring using String.substring(..)
                String sub = bigString.substring(1, bigString.length() - 1);
                // if this method does not create a completely new character array and only
                // reuses the original one from bigString, then creating the new string sub
                // should not cause any noticeable memory consumption
                list.add(sub); // these references must be stored somewhere, otherwise the
                               // garbage collector will get rid of the unused String objects
                System.out.print(String.format("Creating substring #%d; ", i - 1));
                print();
            }
            System.out.println("***************************************");
            print();
        }

        static void print() {
            System.out.println("Memory used: " + (Runtime.getRuntime().totalMemory()
                    - Runtime.getRuntime().freeMemory()) / 1024 / 1024 + " mb");
        }
    }

and this is what happened:

    Starting:
    Memory used: 0 mb
    ********************************
    created the big string bigString from the array big.
    Now: Memory used: 382 mb
    ************************************
    Now we will create substrings with substring(..) and watch what happens to memory
    Adding substring #1; Memory used: 573 mb
    Adding substring #2; Memory used: 763 mb
    Adding substring #3; Memory used: 954 mb
    Adding substring #4; Memory used: 1145 mb
    Adding substring #5; Memory used: 1336 mb
    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOfRange(Arrays.java:3658)
        at java.lang.String.<init>(String.java:201)
        at java.lang.String.substring(String.java:1956)
        at com.javarush.test.tests.Test.main(Test.java:42)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)

    Process finished with exit code 1

That is, every time you create a new substring using bigString.substring(..), the character array is DUPLICATED. Each substring adds about 190 MB, which is just what a fresh array of 100 million two-byte chars costs. How else can we explain such growth in memory consumption? After this, I personally no longer had any doubts about how the String.substring() method works. What about you?
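The same copying behaviour can also be checked without filling the heap. The sketch below is an assumption-laden illustration, not part of the original experiment: the class CopyDemo and its method names are made up here. It relies only on the documented contract of the String(char[], int, int) constructor and of Arrays.copyOfRange, namely that the contents are copied and later changes to the source array do not show through.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are invented for this example.
class CopyDemo {

    /** Builds a String from part of an array, then overwrites the array. */
    static String stringAfterArrayChange() {
        char[] source = {'a', 'b', 'c', 'd', 'e'};
        // The same constructor that substring(..) calls in the listing from the JDK.
        String s = new String(source, 1, 3);
        Arrays.fill(source, 'x'); // the source array now holds "xxxxx"
        return s;                 // the string still holds its own copy
    }

    /** Copies a range, overwrites the source, and returns the copy as text. */
    static String rangeAfterArrayChange() {
        char[] source = {'a', 'b', 'c', 'd', 'e'};
        char[] part = Arrays.copyOfRange(source, 1, 4); // new array: b, c, d
        Arrays.fill(source, 'x');
        return new String(part);
    }

    public static void main(String[] args) {
        System.out.println(stringAfterArrayChange()); // prints "bcd"
        System.out.println(rangeAfterArrayChange());  // prints "bcd"
    }
}
```

Both calls print "bcd", which only works if the characters were copied into a separate array. As far as I know, JDK releases before Java 7 update 6 did share the array between a string and its substrings, so the memory behaviour of substring(..) depends on the runtime version.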