FindBugs helps you learn Java better

Static code analyzers are popular because they help find errors made through carelessness. What is much more interesting is that they also help correct mistakes made out of ignorance. Even if everything is spelled out in the official language documentation, there is no guarantee that every programmer has read it carefully. And you can't blame programmers for that: reading all the documentation is exhausting. In this respect a static analyzer is like an experienced friend who sits next to you and watches you write code. He not only tells you "This is where you made a copy-paste mistake," but also says "No, you can't write it like that, go look at the documentation." Such a friend is more useful than the documentation itself, because he brings up only the things you actually run into in your work and stays silent about the ones you will never need. In this post I'll talk about some of the Java subtleties I learned from using the FindBugs static analyzer. Perhaps some of them will surprise you too. Importantly, none of the examples are contrived: all are based on real code.

Ternary operator ?:

It would seem that there is nothing simpler than the ternary operator, but it has its pitfalls. I believed that there was no fundamental difference between the constructs

Type var = condition ? valTrue : valFalse;

and

Type var; if(condition) var = valTrue; else var = valFalse;

It turned out that there is a subtlety here. Since the ternary operator can be part of a larger expression, its result must have a concrete type determined at compile time. Therefore, when the condition is true, the if form converts valTrue directly to Type, whereas the ternary form first converts it to the common type of valTrue and valFalse (even though valFalse is not evaluated) and only then converts the result to Type. The conversion rules are not entirely trivial when the expression involves primitive types and their wrappers (Integer, Double, etc.). All the rules are described in detail in JLS 15.25. Let's look at some examples.

Number n = flag ? new Integer(1) : new Double(2.0);

What ends up in n if flag is set? A Double object with the value 1.0. The compiler finds our clumsy attempts to create an object amusing. Since the second and third operands are wrappers over different primitive types, the compiler unboxes them and converts both to the wider type (in this case, double) by binary numeric promotion. After the ternary operator is evaluated, the result is boxed again for the assignment. Essentially the code is equivalent to this:

Number n;
if (flag)
    n = Double.valueOf((double) (new Integer(1).intValue()));
else
    n = Double.valueOf(new Double(2.0).doubleValue());

From the compiler's point of view, the code contains no problems and compiles perfectly. But FindBugs gives a warning:

BX_UNBOXED_AND_COERCED_FOR_TERNARY_OPERATOR: Primitive value is unboxed and coerced for ternary operator in TestTernary.main(String[])
A wrapped primitive value is unboxed and converted to another primitive type as part of the evaluation of a conditional ternary operator (the b ? e1 : e2 operator). The semantics of Java mandate that if e1 and e2 are wrapped numeric values, the values are unboxed and converted/coerced to their common type (e.g., if e1 is of type Integer and e2 is of type Float, then e1 is unboxed, converted to a floating point value, and boxed. See JLS Section 15.25.

Of course, FindBugs also warns that Integer.valueOf(1) is more efficient than new Integer(1), but everyone already knows that.
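To see the promotion described above in isolation, here is a minimal sketch that puts the ternary form and the if/else form side by side. The class name TernaryPromotionDemo and both method names are hypothetical, invented for this illustration; it uses Integer.valueOf and Double.valueOf instead of the constructors from the example.

```java
// Minimal sketch: how ?: promotes mixed numeric wrappers.
// Class and method names are hypothetical, chosen only for this illustration.
class TernaryPromotionDemo {

    // Ternary form: both operands are unboxed, promoted to double, then re-boxed.
    static Number viaTernary(boolean flag) {
        return flag ? Integer.valueOf(1) : Double.valueOf(2.0);
    }

    // if/else form: each branch is converted to Number on its own.
    static Number viaIfElse(boolean flag) {
        if (flag) {
            return Integer.valueOf(1);
        } else {
            return Double.valueOf(2.0);
        }
    }

    public static void main(String[] args) {
        Number a = viaTernary(true);
        Number b = viaIfElse(true);
        System.out.println(a + " " + a.getClass().getSimpleName()); // 1.0 Double
        System.out.println(b + " " + b.getClass().getSimpleName()); // 1 Integer
    }
}
```

If the Integer really has to survive, the if/else form (or an explicit cast of one operand to Number) keeps the two branches from being merged into a common primitive type.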

Or this example:

Integer n = flag ? 1 : null;

The author wants to put null in n if the flag is not set. Do you think it will work? Yes. But let's complicate things:

Integer n = flag1 ? 1 : flag2 ? 2 : null;

It would seem that there is not much difference. However, now if both flags are clear, this line throws a NullPointerException. The operands of the right ternary operator are int and null, so its result type is Integer. The operands of the left one are int and Integer, so by the Java rules the result is int. Getting there requires unboxing via intValue, and calling that on null throws the exception. The code is equivalent to this:

Integer n;
if (flag1)
    n = Integer.valueOf(1);
else {
    if (flag2)
        n = Integer.valueOf(Integer.valueOf(2).intValue());
    else
        n = Integer.valueOf(((Integer) null).intValue());
}

Here FindBugs produces two messages, which are enough to suspect an error:

BX_UNBOXING_IMMEDIATELY_REBOXED: Boxed value is unboxed and then immediately reboxed in TestTernary.main(String[])

NP_NULL_ON_SOME_PATH: Possible null pointer dereference of null in TestTernary.main(String[])
There is a branch of statement that, if executed, guarantees that a null value will be dereferenced, which would generate a NullPointerException when the code is executed.

Well, one last example on this topic:

double[] vals = new double[] {1.0, 2.0, 3.0};

double getVal(int idx) {
    return (idx < 0 || idx >= vals.length) ? null : vals[idx];
}

It is not surprising that this code does not work: how can a function returning a primitive type return null? Surprisingly, it compiles without problems. Well, you already understand why it compiles.
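The hidden unboxing behind these null cases can be reproduced with a smaller sketch. It deliberately uses a simpler shape than the examples above (an int literal mixed with an Integer parameter), and the names TernaryNullDemo, pickUnsafe and pickSafe are hypothetical.

```java
// Minimal sketch of the hidden unboxing in ?: when int and Integer are mixed.
// Class and method names are hypothetical, chosen only for this illustration.
class TernaryNullDemo {

    // Operands are int and Integer, so the expression has type int:
    // a null fallback is unboxed when that branch is taken and throws.
    static Integer pickUnsafe(boolean flag, Integer fallback) {
        return flag ? 1 : fallback;
    }

    // Both operands are Integer, so no unboxing happens and null passes through.
    static Integer pickSafe(boolean flag, Integer fallback) {
        return flag ? Integer.valueOf(1) : fallback;
    }

    public static void main(String[] args) {
        System.out.println(pickSafe(false, null)); // null
        try {
            pickUnsafe(false, null);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from hidden unboxing");
        }
    }
}
```

One way to stay out of trouble is to make sure both branches already have the reference type you want, so the compiler has nothing to unbox.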

DateFormat

To format dates and times in Java, it is recommended to use subclasses of the abstract class DateFormat. For example, it looks like this:

public String getDate() { return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(new Date()); }

Often a class uses the same format over and over again. Many people will come up with an optimization: why create a format object every time when you can share a single instance?

private static final DateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

public String getDate() {
    return format.format(new Date());
}

It’s so beautiful and cool, but unfortunately it doesn’t work. More precisely it works, but occasionally breaks. The fact is that the documentation for DateFormat says:

Date formats are not synchronized. It is recommended to create separate format instances for each thread. If multiple threads access a format concurrently, it must be synchronized externally.

A look at the internal implementation of SimpleDateFormat confirms this. While format() is running, the object writes to its own fields, so using one SimpleDateFormat from two threads at the same time will, with some probability, produce an incorrect result. Here's what FindBugs writes about this:

STCAL_INVOKE_ON_STATIC_DATE_FORMAT_INSTANCE: Call to method of static java.text.DateFormat in TestDate.getDate() As the JavaDoc states, DateFormats are inherently unsafe for multithreaded use. The detector has found a call to an instance of DateFormat that has been obtained via a static field. This looks suspicious. For more information on this see Sun Bug #6231579 and Sun Bug #6178997.
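One way to follow the documentation's advice of a separate instance per thread, while still avoiding a new object on every call, is a ThreadLocal. This is only a sketch under that assumption, not the fix the original code base used; the class name SafeDateFormatting and the explicit Date parameter are hypothetical, and ThreadLocal.withInitial requires Java 8 or later.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;

// Minimal sketch: one SimpleDateFormat per thread instead of one shared instance.
// The class name and method signature are hypothetical.
class SafeDateFormatting {
    private static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    // Each thread lazily creates its own formatter on first use.
    private static final ThreadLocal<DateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat(PATTERN));

    static String getDate(Date date) {
        return FORMAT.get().format(date);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> System.out.println(
                Thread.currentThread().getName() + ": " + getDate(new Date()));
        Thread t1 = new Thread(task, "worker-1");
        Thread t2 = new Thread(task, "worker-2");
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

On Java 8 and later, java.time.format.DateTimeFormatter is another option: it is immutable and thread-safe, so a single static instance can be shared.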

Pitfalls of BigDecimal

Having learned that the BigDecimal class can store fractional numbers of arbitrary precision, and seeing that it has a constructor that takes a double, some will decide that everything is clear and write this:

System.out.println(new BigDecimal(1.1));

Nothing forbids this, but the result may be unexpected: 1.100000000000000088817841970012523233890533447265625. This happens because a primitive double is stored in IEEE 754 format, which cannot represent 1.1 exactly (in binary it is an infinite repeating fraction), so the nearest representable value is stored instead. The BigDecimal(double) constructor, by contrast, is exact: it converts the given IEEE 754 number to decimal form without any loss (a finite binary fraction can always be written as a finite decimal). If you want exactly 1.1 as a BigDecimal, write either new BigDecimal("1.1") or BigDecimal.valueOf(1.1). If you do not print the number right away but first do some arithmetic with it, you may never figure out where the error came from. FindBugs issues the warning DMI_BIGDECIMAL_CONSTRUCTED_FROM_DOUBLE, which gives the same advice.

Here's another thing:

BigDecimal d1 = new BigDecimal("1.1");
BigDecimal d2 = new BigDecimal("1.10");
System.out.println(d1.equals(d2));

d1 and d2 represent the same number, yet equals returns false because it compares not only the values but also the scale (the number of digits after the decimal point). This is written in the documentation, but few people read the documentation for a method as familiar as equals, and the problem may not show up right away. FindBugs itself, unfortunately, does not warn about this, but there is a popular extension for it, fb-contrib, which catches this bug:

MDM_BIGDECIMAL_EQUALS equals() being called to compare two java.math.BigDecimal numbers. This is normally a mistake, as two BigDecimal objects are only equal if they are equal in both value and scale, so that 2.0 is not equal to 2.00. To compare BigDecimal objects for mathematical equality, use compareTo() instead.
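Both BigDecimal pitfalls fit into one short sketch. The class name BigDecimalPitfalls and the helper sameValue are hypothetical; everything else is the standard java.math.BigDecimal API.

```java
import java.math.BigDecimal;

// Minimal sketch of the two BigDecimal pitfalls discussed above.
// The class name and the sameValue helper are hypothetical.
class BigDecimalPitfalls {

    // Mathematical equality: compareTo ignores scale, unlike equals.
    static boolean sameValue(BigDecimal a, BigDecimal b) {
        return a.compareTo(b) == 0;
    }

    public static void main(String[] args) {
        // Exact value of the double nearest to 1.1: a long tail of digits.
        System.out.println(new BigDecimal(1.1));
        System.out.println(new BigDecimal("1.1"));   // 1.1
        System.out.println(BigDecimal.valueOf(1.1)); // 1.1

        BigDecimal d1 = new BigDecimal("1.1");
        BigDecimal d2 = new BigDecimal("1.10");
        System.out.println(d1.equals(d2));     // false: scales 1 and 2 differ
        System.out.println(sameValue(d1, d2)); // true
    }
}
```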

Line breaks and printf

Programmers who switch to Java after C are often happy to discover PrintStream.printf (as well as PrintWriter.printf, etc.): great, I know this one, it's just like in C, nothing new to learn. In fact there are differences. One of them concerns line breaks. The C language distinguishes between text and binary streams. A '\n' character written to a text stream by any means is automatically converted to the system-dependent newline ("\r\n" on Windows). Java has no such distinction: the correct sequence of characters must be passed to the output stream. The methods of the PrintStream.println family, for example, do this automatically. But with printf, a '\n' in the format string is just '\n', not a system-dependent newline. For example, let's write the following code:

System.out.printf("%s\n", "str#1");
System.out.println("str#2");

Redirecting the output to a file on Windows, we see that the first line ends with a bare "\n" while the second ends with "\r\n".


Thus, you can end up with a strange mix of line breaks in a single stream, which looks sloppy and can confuse a parser. The error may go unnoticed for a long time, especially if you work primarily on Unix systems. To insert a correct newline with printf, use the special format specifier "%n". Here's what FindBugs writes about this:

VA_FORMAT_STRING_USES_NEWLINE: Format string should use %n rather than \n in TestNewline.main(String[]) This format string include a newline character (\n). In format strings, it is generally preferable better to use %n, which will produce the platform-specific line separator.
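The difference is easy to check without redirecting anything to a file. The sketch below uses String.format, which shares the format syntax of printf, and makes the line terminators visible; the class name NewlineDemo and its helper methods are hypothetical.

```java
// Minimal sketch: "\n" versus "%n" in a Java format string.
// Class and method names are hypothetical, chosen only for this illustration.
class NewlineDemo {

    // "\n" in the format string is emitted as a single line-feed character.
    static String withEscape(String s) {
        return String.format("%s\n", s);
    }

    // "%n" is replaced by the platform line separator.
    static String withPercentN(String s) {
        return String.format("%s%n", s);
    }

    // Makes the terminators visible when printed.
    static String visible(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        System.out.println(visible(withEscape("str#1")));   // str#1\n on every platform
        System.out.println(visible(withPercentN("str#2"))); // str#2\r\n on Windows, str#2\n on Unix
    }
}
```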

Perhaps some readers have known all of the above for a long time. But I am almost sure that even for them a static analyzer has some interesting warning in store, one that will reveal something new about the programming language they use.
