This article was written as a test assignment for a position on the JavaRush team, and it was written as a full-fledged lecture, so I have tried to pack as much useful knowledge into it as I could. In addition to practical and theoretical information, the article contains interesting facts that you might not know!
Hello World!
Character escaping is an interesting and necessary technical solution, and the need for it has played an important role in the history of the programming industry. In this article we will talk about what character escaping is, why characters need to be escaped, and how character escaping is implemented in Java. The article provides examples and interesting facts related to the topic. Enjoy reading!
Much of the information we work with in a computer system is text, which at a lower level is stored as bytes. When we write a letter or a message, we type text that a person can understand. When we write code in the IDE, we type text that the compiler can parse. In Java, text is represented by the String type, and a string literal is delimited by special characters: a pair of double quotes.
String str = "Hello World!";
The text "Hello World!" causes no problems, but what if the same text has to be shown as direct speech? By the rules of grammar, "Hello World!" must then be enclosed in direct-speech quotes in addition to the quotes that delimit the String literal.
String str = "Java said, "Hello World!"";
This option will not work, because the compiler cannot tell where the initialization of the variable str ends. To solve this and similar problems, character escaping was invented: special characters are replaced with so-called control sequences, also known as escape sequences. Below is a list of valid Java escape sequences for use in strings.
\t - Tab character (moves the output to the next tab stop; how wide it looks depends on where it is displayed);
\b - Backspace character (moves back one position in the text);
\n - New line (line feed) character;
\r - Carriage return character;
\f - Form feed character (advance to the beginning of the next page);
\' - Single quote character;
\" - Double quote character;
\\ - Backslash character (\).
Now let's mark the direct speech in our phrase so that the compiler can parse what is written.
String str = "Java said, \"Hello World!\"";
Now the text is understandable both to the compiler and to a person who sees the contents of the variable str displayed on the screen. We have figured out what character escaping is and why it is needed, and we have even escaped the double quote character! Let's move on to the remaining escape sequences.
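Before moving on, here is a minimal sketch that checks what the escaped literal actually contains. The class name EscapedQuotes and the method directSpeech are made up for this illustration; only the literal itself comes from the article.

```java
public class EscapedQuotes {
    // Hypothetical helper: returns the literal from the text.
    // Each \" in the source code stands for one double-quote character.
    static String directSpeech() {
        return "Java said, \"Hello World!\"";
    }

    public static void main(String[] args) {
        String str = directSpeech();
        System.out.println(str);          // prints: Java said, "Hello World!"
        System.out.println(str.length()); // prints: 25
    }
}
```

The length is 25, not 27: the backslashes exist only in the source code, and each escape sequence becomes a single character in the string.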
The tab character is written in a string as the escape sequence \t. It is often displayed about as wide as several spaces (the exact width depends on the tab stops of the console or editor), but it is a single character: a string of four spaces has a length of four, while a string containing one tab character has a length of one. The tab character is often used to build tables or pseudo-graphic interface elements, because it is more convenient than typing runs of spaces.
[Figure: example of a pseudo-graphical interface]
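The point about length, and the use of tabs for simple tables, can be sketched as follows. The class TabDemo, the helper row, and the table contents are invented for this example, and how well the columns line up depends on the tab stops of your console.

```java
public class TabDemo {
    // Hypothetical helper: builds one table row with the two
    // columns separated by a single tab character.
    static String row(String name, String value) {
        return name + "\t" + value;
    }

    public static void main(String[] args) {
        System.out.println("\t".length());   // prints: 1
        System.out.println("    ".length()); // prints: 4
        // A tiny table; the sample data is made up.
        System.out.println(row("Item", "Price"));
        System.out.println(row("Milk", "1.20"));
        System.out.println(row("Bread", "0.95"));
    }
}
```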
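Of all the escape sequences, \b is perhaps the most interesting: when it is printed, the cursor moves back one position, so the next character printed replaces the last one, much as if we had erased it by pressing the backspace key. How this is displayed depends on the console; some consoles ignore the character or show it as a symbol.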
System.out.print("2 + 2 = 5");
System.out.print("\b");
System.out.print("4");
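What \b looks like on screen depends on the console, so the sketch below checks only what is certain: \b is an ordinary character stored in the string, and it is up to the console to interpret it. The class name BackspaceDemo and the method printed are assumptions made for this illustration.

```java
public class BackspaceDemo {
    // Hypothetical helper: the three pieces printed in the example,
    // joined into a single string.
    static String printed() {
        return "2 + 2 = 5" + "\b" + "4";
    }

    public static void main(String[] args) {
        String s = printed();
        System.out.println(s.length());        // prints: 11
        System.out.println((int) s.charAt(9)); // prints: 8 (the character code of \b)
        System.out.println(s);                 // how this line looks depends on the console
    }
}
```

On a console that honors backspace, the last line is shown as 2 + 2 = 4. The string itself still contains the 5, the \b and the 4.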
The characters \n and \r share a common history, so let's look at them together. You have probably met the line break character \n before. The println() method ends its output with a line break, so the next output starts on a new line. The print() method does not, but if you add \n to the end of its output, a line break is performed.
System.out.print("Next output will be on a new line\n");
System.out.println("Next output will be on a new line");
The carriage return character \r returns the cursor to the beginning of the current output line, so that new output is written over what was there before. Depending on the console, old characters that the new text does not cover may remain visible.
System.out.print("Text to be rewritten.");
System.out.print('\r');
System.out.print("New text.");
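To make the overwriting behavior concrete, here is a small sketch that simulates a console which overwrites characters in place after \r. This is a simplified model written for this article, not how any particular console is implemented; the class CarriageReturnDemo and the method render are hypothetical. Real consoles differ, and some may clear the line instead.

```java
public class CarriageReturnDemo {
    // Simplified model of a single console line: \r moves the cursor
    // to column 0, any other character overwrites or extends the line.
    static String render(String output) {
        StringBuilder line = new StringBuilder();
        int cursor = 0;
        for (char c : output.toCharArray()) {
            if (c == '\r') {
                cursor = 0;                  // back to the start of the line
            } else if (cursor < line.length()) {
                line.setCharAt(cursor++, c); // overwrite an existing character
            } else {
                line.append(c);              // extend the line
                cursor++;
            }
        }
        return line.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("Text to be rewritten.\rNew text."));
        // prints: New text.e rewritten.
        System.out.println(render("Loading 10%\rLoading 99%"));
        // prints: Loading 99%
    }
}
```

In this model the shorter new text leaves the tail of the old text visible, which is why \r is most often used for output of constant width, such as a progress counter.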
In fact, the carriage return dates back to the days when text was typed on typewriters. To start a new line, it was necessary to move the carriage back and lower the lever (part of the typewriter mechanism), after which the paper was fed by one line. If the lever was not lowered, typing continued on the same line, which is what we observe when we output the \r character. Out of habit, programmers who wanted a line break therefore put the character sequence \r\n at the end of the output. As the typewriter era came to an end, a generation of programmers appeared who still used this sequence although they had never worked at a typewriter themselves. They often forgot in which order the sequence had to be written, \r\n or \n\r. The word return came to their aid as a check: in it, the letter r comes before the letter n. Later, when developing software for the first versions of Windows, after MS-DOS, programmers were required to use the sequence \r\n. Now you don't have to worry about this and can use just the \n character to break a line.
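Text that comes from files or other systems may still contain \r\n. The sketch below shows one possible way to normalize such text to \n, along with two standard Java facilities for the platform's own line separator: System.lineSeparator() and the %n format specifier. The class LineEndings and the method normalize are names invented for this example.

```java
public class LineEndings {
    // Hypothetical helper: converts \r\n pairs and lone \r characters to \n.
    static String normalize(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String windowsText = "first\r\nsecond\r\n";
        System.out.println(normalize(windowsText).equals("first\nsecond\n")); // prints: true

        // The separator of the current platform: 2 characters on Windows,
        // 1 on Unix-like systems.
        System.out.println(System.lineSeparator().length());

        // %n in a format string expands to the same platform separator.
        System.out.print(String.format("done%n"));
    }
}
```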
Let's go back in time again, to around the 1980s. It was then that the form feed character \f, which advances to the beginning of the next page, became popular. At that time there were large line printers, and to work with them you had to write program code describing what the printer should print and how. To indicate that the text should continue on a new page, the \f character was used. Today this character has long lost its relevance, and you are unlikely ever to come across it. The dimensions of a line printer are quite impressive.
With the sequences \' and \\ everything works exactly as with escaping a double quote, shown in the example at the beginning of the article. You have to escape a single quote, for example, to initialize a char with a single quote.
char ch = '\'';
Escape the backslash character to indicate that the character after it is not part of an escape sequence.
System.out.println("\\n - line break escape sequence");
In practice, you often have to escape backslashes when working with paths:
System.out.println("It's Java string: \"C:\\Program Files\\Java\\jdk1.7.0\\bin\"");
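The difference between an escape sequence and an escaped backslash can be checked by looking at string lengths. This is a minimal sketch; the class BackslashDemo and the method path are made up for the illustration, and the path is the one from the example above.

```java
public class BackslashDemo {
    // Hypothetical helper: the Windows path from the example,
    // with every backslash escaped in the source code.
    static String path() {
        return "C:\\Program Files\\Java\\jdk1.7.0\\bin";
    }

    public static void main(String[] args) {
        System.out.println("\n".length());   // prints: 1 (a single line feed character)
        System.out.println("\\n".length());  // prints: 2 (a backslash followed by the letter n)
        System.out.println(path());          // prints: C:\Program Files\Java\jdk1.7.0\bin
        System.out.println(path().length()); // prints: 34
    }
}
```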
I emphasized that these escape sequences are used in strings (string literals) because there are other escape sequences, used in regular expressions of the Pattern class, that are not relevant to the topic of this article.
A list of all escape sequences for the Pattern class can be found in its documentation. However, it is worth noting that regular expressions as they exist today are unimaginable without escape sequences, not only in Java but also in other popular programming languages, for example PHP. In Java, character escaping is also used in string formatting. For example, to display a percent sign in a format string you must double it, %%; otherwise the formatter throws an exception at run time, and the IDE may prompt you to add the second percent sign.
System.out.printf("Milk fat percentage : %d%%", 10);
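The same format string can be tried with String.format, which uses the same formatting rules as printf. The sketch below also shows what happens when the percent sign is not doubled. The class PercentDemo and the method fatLabel are hypothetical names chosen for this example.

```java
import java.util.IllegalFormatException;

public class PercentDemo {
    // Hypothetical helper: %d is replaced by the number,
    // %% produces one literal percent sign.
    static String fatLabel(int percent) {
        return String.format("Milk fat percentage : %d%%", percent);
    }

    public static void main(String[] args) {
        System.out.println(fatLabel(10)); // prints: Milk fat percentage : 10%

        try {
            // A single trailing % is not a valid format specifier.
            String.format("Milk fat percentage : %d%", 10);
        } catch (IllegalFormatException e) {
            System.out.println("Invalid format: " + e.getClass().getSimpleName());
        }
    }
}
```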
This concludes the article. I hope you learned a lot about character escaping and how to put it into practice. Character escaping is present in many programming languages, and in Java, as in other C-like languages, it is implemented almost identically. The knowledge you gained from this article may therefore be useful beyond Java. Thank you for your attention, and good luck with your studies!