28 February 2021

Coffee break #64. How to write clean code. Why Java is better than C++ for low latency systems

How to write clean code

Source: Dev.to

Writing clean code is like writing poetry: it should be concise, understandable, and easy to change. Clean code implies a scalable organization, which means that making changes doesn't cause chaos. The ability to write such code is one of the key qualities of an experienced developer. After several people recommended the book Clean Code to me, I finally plucked up the courage to read it. It turned out to be one of those books that fully lives up to the hype around it. The recommendations in the book are clear, specific, practical, and even presented with humor. Today I want to share with you the main takeaways from this book.

1. The code should not only work, but also be readable

Most of the cost of software is tied to long-term support. Therefore, the code you write must clearly express your intentions. It should be such that new developers joining the team can easily understand what exactly is happening in the code and why. The more understandable the code the author writes, the less time it will take other developers to understand it. This reduces defects and maintenance costs. How to achieve this? Good naming + classes and functions with single responsibility + writing tests.

2. Later means never

Let's be honest: we all sometimes promise ourselves that we'll come back and clean up the code later, but we end up forgetting about it. Don't leave behind pieces of code that are no longer needed. They confuse other developers and have no value. Therefore, when you change functionality, always remove the old code. If something breaks somewhere, the tests will show it right away. How to achieve this? Deleting code can be scary, especially in large architectures, so tests are key here. They allow you to remove code with confidence.

3. Functions should be small

The first rule of writing functions is that they should be small, up to about 20 lines. The smaller the function and the more focused it is on one task, the easier it is to find a good name for it. As for function arguments, the ideal number is 0. Next come 1 and 2, and you should try to have no more than 3. How to achieve this? Write functions in accordance with the single responsibility and open/closed principles.
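As a rough illustration of the advice above, here is a minimal sketch in which an order total is computed by several tiny functions, each taking a single argument. The `LineItem` type, the method names, and the 10% discount rule are all invented for this example and do not come from the book.

```java
import java.util.Arrays;
import java.util.List;

final class SmallFunctionsSketch {

    // Hypothetical domain type, invented for this illustration.
    static final class LineItem {
        final int unitPriceCents;
        final int quantity;

        LineItem(int unitPriceCents, int quantity) {
            this.unitPriceCents = unitPriceCents;
            this.quantity = quantity;
        }
    }

    // One argument, one job: the price of a single line.
    static int lineTotalCents(LineItem item) {
        return item.unitPriceCents * item.quantity;
    }

    // Sums the line totals and nothing else.
    static int subtotalCents(List<LineItem> items) {
        int subtotal = 0;
        for (LineItem item : items) {
            subtotal += lineTotalCents(item);
        }
        return subtotal;
    }

    // Assumed business rule: 10% off once the subtotal reaches 100.00.
    static int discountCents(int subtotalCents) {
        return subtotalCents >= 10_000 ? subtotalCents / 10 : 0;
    }

    // Reads like a summary of the steps because each step has a name.
    static int orderTotalCents(List<LineItem> items) {
        int subtotal = subtotalCents(items);
        return subtotal - discountCents(subtotal);
    }

    public static void main(String[] args) {
        List<LineItem> items = Arrays.asList(new LineItem(7_500, 1), new LineItem(1_250, 2));
        System.out.println(orderTotalCents(items)); // prints 9000
    }
}
```

Because every function does one thing, each name can say exactly what that thing is, and each piece can be tested on its own.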

4. Code duplication is bad

Duplication is the enemy of a well-organized system. It's extra work, extra risk, and extra unnecessary complexity. What to do about it? Make sure your code is written according to the DRY principle, isolated and modular.
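One way to picture DRY is a validation rule that would otherwise be copied for every field. The sketch below keeps the rule in a single helper; `requireNonBlank` and `registerUser` are hypothetical names chosen for the example.

```java
final class DrySketch {

    // Single source of truth for the (assumed) "must not be blank" rule.
    static String requireNonBlank(String value, String fieldName) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(fieldName + " must not be blank");
        }
        return value.trim();
    }

    // Both fields reuse the same rule instead of repeating the null/empty checks.
    static String registerUser(String name, String email) {
        String cleanName = requireNonBlank(name, "name");
        String cleanEmail = requireNonBlank(email, "email");
        return cleanName + " <" + cleanEmail + ">";
    }

    public static void main(String[] args) {
        System.out.println(registerUser(" Ann ", "ann@example.com")); // prints Ann <ann@example.com>
    }
}
```

If the rule changes later (say, a maximum length is added), there is exactly one place to change it.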

5. The only good comment is the one you found a way not to write.

“There is nothing more useful than a good comment in the right place. But comments, even in the best case scenario, are a necessary evil.” Comments are intended to compensate for our inability to express our thoughts in code. That is, a comment is an admission of defeat from the start. Yes, we have to use them because we can't always make our intentions clear with code, but that's no reason to celebrate. The thing is, comments often lie. Not always and not on purpose, but too often. The older a comment is and the further away it is from the code it describes, the more likely it is to be incorrect. The reason for this is simple: programmers cannot realistically maintain both the code and all the comments. Therefore, comments very often get separated from the code they refer to and become orphaned annotations of ever-decreasing accuracy. What to do about it? Use descriptive names. When you read the name of a variable, you should immediately understand what it is. Tests are also needed so that other developers understand which functionality is most important.
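To make the point concrete, here is a small before-and-after sketch. The eligibility rule and both thresholds are made up for the example; what matters is that the second version needs no comment because the names carry the explanation.

```java
final class NamingInsteadOfCommentsSketch {

    // Assumed thresholds, invented for this illustration.
    static final int MINIMUM_AGE_FOR_FULL_BENEFITS = 65;
    static final int MINIMUM_YEARS_OF_SERVICE = 10;

    // Before: the comment has to explain what the condition means.
    static boolean check(int a, int y) {
        // employee is old enough and has worked long enough for full benefits
        return a >= 65 && y >= 10;
    }

    // After: the names say the same thing, and they cannot drift away from the code.
    static boolean isEligibleForFullBenefits(int ageInYears, int yearsOfService) {
        return ageInYears >= MINIMUM_AGE_FOR_FULL_BENEFITS
                && yearsOfService >= MINIMUM_YEARS_OF_SERVICE;
    }

    public static void main(String[] args) {
        System.out.println(isEligibleForFullBenefits(67, 12)); // prints true
        System.out.println(isEligibleForFullBenefits(67, 3));  // prints false
    }
}
```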

6. The object reveals behavior, but not data.

A module should not know about the internals of the objects it manipulates. Objects hide their data and reveal their operations. This means that an object should not expose its internal structure through accessor methods. It's not necessary for everyone to see you naked. What to do about it? The scope of variables should be as local as possible so as not to expose more than necessary.
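A minimal sketch of this idea, using a hypothetical `FuelTank` class: callers can ask how much fuel is left and tell the tank to burn some, but they never see or set the fields that hold the state.

```java
final class FuelTank {
    // Internal representation: private, so it can change without breaking callers.
    private final double capacityLitres;
    private double litresLeft;

    FuelTank(double capacityLitres, double litresLeft) {
        if (capacityLitres <= 0 || litresLeft < 0 || litresLeft > capacityLitres) {
            throw new IllegalArgumentException("invalid tank state");
        }
        this.capacityLitres = capacityLitres;
        this.litresLeft = litresLeft;
    }

    // Behavior expressed in the caller's terms, not a raw getter for a field.
    double percentRemaining() {
        return 100.0 * litresLeft / capacityLitres;
    }

    // The object guards its own invariants instead of exposing a setter.
    void burn(double litres) {
        if (litres < 0 || litres > litresLeft) {
            throw new IllegalArgumentException("cannot burn " + litres + " litres");
        }
        litresLeft -= litres;
    }

    public static void main(String[] args) {
        FuelTank tank = new FuelTank(50.0, 25.0);
        tank.burn(5.0);
        System.out.println(tank.percentRemaining()); // prints 40.0
    }
}
```

Whether the tank stores litres, gallons, or a percentage internally is invisible from the outside, which is the point.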

7. Testing

Test code is just as important as the code that goes into production. Therefore, it must change and grow as the project develops. Tests keep your code flexible, maintainable, and reusable. Without them, any change may introduce bugs. Tests allow you to clean up your code without fear that something will break. That is why keeping the tests themselves clean matters so much: clean tests are readable tests. Tests are an opportunity to explain the code author's intentions to other developers in simple language. Therefore, we test only one concept in each test function. This makes the test descriptive and easier to read, and if it fails, the reason is easier to track down. How to achieve this? Follow the FIRST principles of clean tests. Tests should be:

Fast. Tests must run quickly. If you have to wait too long for a test to run, you're less likely to run it often.
Independent. Tests should be as isolated and independent of each other as possible.
Repeatable. Tests should be repeatable in any environment: development, staging, and production.
Self-Validating. The result of the test must be a Boolean value. The test must either pass or fail.
Thorough. We should strive to cover all edge cases, all security issues, every use case, and the happy path (the most favorable scenario for the code) with tests.
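The sketch below shows what "one concept per test" and "self-validating" can look like. To stay self-contained it uses plain Java rather than a test framework such as JUnit, which you would normally use; the `clamp` function under test and the tiny `expectEquals` helper are invented for the example.

```java
final class FirstTestSketch {

    // Hypothetical code under test.
    static int clamp(int value, int min, int max) {
        return Math.max(min, Math.min(max, value));
    }

    // Each test covers exactly one concept and shares no state with the others.
    static void clampKeepsValueInsideRange() {
        expectEquals(5, clamp(5, 0, 10));
    }

    static void clampRaisesValueBelowRangeToMin() {
        expectEquals(0, clamp(-3, 0, 10));
    }

    static void clampLowersValueAboveRangeToMax() {
        expectEquals(10, clamp(42, 0, 10));
    }

    // Self-validating: a test either returns normally (pass) or throws (fail).
    static void expectEquals(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    // Runs every test and reports how many were executed.
    static int runAll() {
        Runnable[] tests = {
            FirstTestSketch::clampKeepsValueInsideRange,
            FirstTestSketch::clampRaisesValueBelowRangeToMin,
            FirstTestSketch::clampLowersValueAboveRangeToMax
        };
        for (Runnable test : tests) {
            test.run();
        }
        return tests.length;
    }

    public static void main(String[] args) {
        System.out.println(runAll() + " tests passed"); // prints 3 tests passed
    }
}
```

The test names double as documentation: if `clampLowersValueAboveRangeToMax` fails, the name alone tells you which behavior broke.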

8. Handling errors and exceptions

Each exception you throw should provide enough context to determine the source and location of the error. Typically you have a stack trace of any exception, but a stack trace won't tell you the purpose of the operation that failed. If possible, avoid passing null in your code. If you are tempted to return null from a method, consider throwing an exception instead. Make error handling a separate task that can be viewed independently of the main logic. How to achieve this? Create informative error messages and pass them along with your exceptions. Specify the operation that failed and the type of error.
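Here is one possible shape for these recommendations, sketched with invented names (`AccountNotFoundException`, `balanceCentsFor`). The lookup returns an `Optional` instead of null, the exception message names both the operation that failed and the reason, and the error handling sits apart from the main logic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class ErrorContextSketch {

    // Hypothetical exception type that carries the failed operation as context.
    static final class AccountNotFoundException extends RuntimeException {
        AccountNotFoundException(String operation, String accountId) {
            super("Cannot " + operation + ": no account with id '" + accountId + "'");
        }
    }

    private final Map<String, Long> balanceCentsById = new HashMap<>();

    void open(String accountId, long balanceCents) {
        balanceCentsById.put(accountId, balanceCents);
    }

    // Absence is explicit in the return type, so callers never receive null.
    Optional<Long> findBalanceCents(String accountId) {
        return Optional.ofNullable(balanceCentsById.get(accountId));
    }

    // Callers that need a value get one or an exception that explains itself.
    long balanceCentsFor(String operation, String accountId) {
        return findBalanceCents(accountId)
                .orElseThrow(() -> new AccountNotFoundException(operation, accountId));
    }

    public static void main(String[] args) {
        ErrorContextSketch accounts = new ErrorContextSketch();
        accounts.open("acc-1", 2_500L);
        try {
            accounts.balanceCentsFor("withdraw", "acc-42");
        } catch (AccountNotFoundException e) {
            // Error handling lives here, away from the lookup logic.
            System.out.println(e.getMessage()); // prints Cannot withdraw: no account with id 'acc-42'
        }
    }
}
```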

9. Classes

Classes should be small. But it's not the lines of code that need to be counted, it's the responsibilities. Class names are key to describing what they are responsible for. Our systems should consist of many small classes, not a few huge ones. Each such small class must encapsulate a single responsibility. There must be only one specific reason for each class to exist, and each class must "cooperate" with several other classes to achieve the desired behavior of the system. There is rarely a good reason to create a public variable. Weakening encapsulation is always a last resort. In addition, there should be few instance variables. Good software design allows changes to be made without large investments or rework. Narrowing the scope of variables makes this task easier. How to achieve this? Separation of concerns is one of the oldest and most important design techniques. Classes should be open for extension, but closed for modification. In an ideal system, we enable new features by extending the system rather than by making changes to existing code.
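A small sketch of "open for extension, closed for modification", assuming a made-up shipping-cost domain. Adding a new pricing scheme means adding a new `ShippingRate` implementation; the `cheapestCents` method never has to change.

```java
import java.util.Arrays;
import java.util.List;

final class OpenClosedSketch {

    // Extension point: new pricing schemes implement this interface.
    interface ShippingRate {
        int costCents(int weightGrams);
    }

    // Small class, single responsibility: a fixed price.
    static final class FlatRate implements ShippingRate {
        private final int cents;

        FlatRate(int cents) {
            this.cents = cents;
        }

        @Override
        public int costCents(int weightGrams) {
            return cents;
        }
    }

    // Another small class: a price proportional to weight.
    static final class PerKilogramRate implements ShippingRate {
        private final int centsPerKilogram;

        PerKilogramRate(int centsPerKilogram) {
            this.centsPerKilogram = centsPerKilogram;
        }

        @Override
        public int costCents(int weightGrams) {
            return centsPerKilogram * weightGrams / 1000;
        }
    }

    // Closed for modification: works with any current or future ShippingRate.
    static int cheapestCents(List<ShippingRate> rates, int weightGrams) {
        return rates.stream()
                .mapToInt(rate -> rate.costCents(weightGrams))
                .min()
                .orElseThrow(() -> new IllegalArgumentException("no rates given"));
    }

    public static void main(String[] args) {
        List<ShippingRate> rates = Arrays.asList(new FlatRate(500), new PerKilogramRate(200));
        System.out.println(cheapestCents(rates, 2_000)); // prints 400
        System.out.println(cheapestCents(rates, 3_000)); // prints 500
    }
}
```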

10. Formatting

Each empty line is a visual cue that a new, separate concept has begun. Local variables should appear at the top of the function. Instance variables should be declared at the top of the class. Short lines are better than long ones. The usual limit is 100-120 characters; you shouldn't go beyond it. How to achieve this? Most of these rules can be configured in a linter that runs in your CI or text editor. Use these tools to make your code as clean as possible.

Program Development Principles

Use the following techniques and your code will always be clean:

Naming variables. Choosing appropriate names (good naming) is critical to making the code readable and therefore maintainable. “You should choose a name for a variable as responsibly as you would for your firstborn.” Choosing good names is often a challenge for developers. This requires good descriptive skills and a shared cultural background. Clean code is code that is read and improved by completely different developers. The name of a variable, function or class should answer all the basic questions: why it exists, what it does, and how it is used. If a name requires a comment, it does not sufficiently reveal the essence of what it describes. Longer names are better than shorter ones, and any searchable name is better than a bare constant in the code. Single-letter names can only be used for local variables inside short methods: the length of a name must match its scope. Method names must be verbs or verb phrases; a class name must not be a verb.
Dependencies should be kept to a minimum. It is better to rely on what you control than on what you cannot control. Otherwise these things will control you.
Accuracy. Every single piece of code should be in a place where the reader expects to find it. Navigation through the codebase should be intuitive, and the developer's intentions should be clear.
Cleaning. Don't leave useless code in the codebase (old and no longer used, or created "just in case"). Reduce duplication and create simple abstractions early on.
Standardization. When writing code, you should follow the style and practices established for the repository.
Self-discipline. As the technologies in use develop and new ones appear, developers often want to change and improve something in existing code. Don't give in to the hype too quickly: study new stacks thoroughly and adopt them only for a specific purpose.

Keeping your codebase clean is about more than being polite to your current and future colleagues. It is essential for the program's long-term survival. The cleaner your code, the happier the developers, the better the product, and the longer it will last.

Why Java is better than C++ for low latency systems

Source: StackOverflow

As developers, we all know that there are two ways to do things: the manual way, which is slow and annoying, and the automated way, which is difficult but quick. I could use artificial intelligence to write this article for me. That could save me a lot of time, since an AI can generate thousands of articles per second, but my editor would probably not be happy to learn that it would take two years to generate the first one.

A similar situation arises when developing software systems with low latency. The conventional wisdom is that it would be crazy to use anything other than C++ because everything else has too much latency. But I'm here to convince you of the opposite, counter-intuitive, almost heretical notion: when it comes to achieving low latency in software systems, Java is better. In this article, I want to take a specific example of software that values low latency: trading systems. However, the arguments presented here can be applied to almost any circumstance in which low latency is required or desired. It's just easier to discuss in relation to the area of development in which I have experience. And the truth is that latency is difficult to measure. It all comes down to what you mean by low latency. Let's figure this out now.

Acquired Wisdom

Since C++ is much closer to the hardware, most developers will tell you that coding in this language offers a speed advantage. In low latency situations, such as high-speed trading, where milliseconds can make the difference between a viable piece of software and a legacy waste of disk space, C++ is considered the gold standard. At least that's how it used to be. But the reality is that many large banks and brokers now use systems written in Java. And I mean natively written in Java, not written in Java and then translated into C++ to reduce latency. These systems are becoming standard even for tier 1 investment banks, despite the fact that they are (supposedly) slower. So what's going on? Yes, C++ may have "low latency" when it comes to executing code, but it's definitely not low latency when it comes to deploying new features or even finding developers who can write it.

(Real) differences between Java and C++

The development time issue is just the beginning when it comes to the differences between Java and C++ in real-world systems. To understand the true value of each language in this context, let's delve a little deeper. First, it's important to remember the real reason C++ is faster than Java in most situations: a C++ pointer is the address of a variable in memory. This means that the software can access individual variables directly and does not have to crawl through computationally intensive tables to look them up. Or at least it can, provided you tell it where they are, because with C++ you often have to explicitly manage the lifetime and ownership of objects. As a result, unless you're really good at writing code (a skill that can take decades to master), C++ will require hours (or weeks) of debugging. And as anyone who has tried to debug a Monte Carlo engine or a PDE test tool will tell you, trying to debug memory access at a fundamental level can be very time-consuming. Just one faulty pointer can easily bring down an entire system, so releasing a new version written in C++ can be truly terrifying.

Of course, that's not all. People who enjoy programming in C++ will point out that Java's garbage collector suffers from non-linear latency spikes. This is especially true of legacy systems, where shipping updates to Java code without breaking client systems can leave them so slow that they are unusable. In response, I would like to point out that a lot of work has been done over the last decade to reduce the latency created by the Java GC. LMAX Disruptor, for example, is a low-latency trading platform written in Java, also built as a framework that has "mechanical sympathy" for the hardware it runs on and is lock-free. Problems can be further mitigated if you build a system that uses a continuous integration and delivery (CI/CD) process, since CI/CD allows for automated deployment of tested code changes. This is because CI/CD provides an iterative approach to reducing garbage collection latency, where Java can incrementally improve and adapt to specific hardware environments without the resource-intensive process of preparing code for different hardware specifications before shipping it.

Since IDE support for Java is much broader than for C++, most environments (Eclipse, IntelliJ IDEA) can refactor Java code. This means that IDEs can help optimize code for low-latency performance, an ability that is still limited when working with C++. Even if Java code doesn't quite match C++ in speed, most developers still find it easier to achieve acceptable performance in Java than in C++.
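To give a feel for what "lock-free" can mean in Java, here is a minimal sketch of a single-producer, single-consumer ring buffer. It is not the LMAX Disruptor and does not use its API; the class and method names are invented, and the design assumes exactly one producer thread and one consumer thread. Slots are allocated once and reused, and the two threads coordinate through volatile sequence counters instead of locks.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

final class SpscRingBufferSketch {
    private final long[] slots;                       // preallocated once, reused forever
    private final int mask;                           // capacity - 1, for cheap index wrapping
    private final AtomicLong head = new AtomicLong(); // next sequence to read
    private final AtomicLong tail = new AtomicLong(); // next sequence to write

    SpscRingBufferSketch(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        slots = new long[capacity];
        mask = capacity - 1;
    }

    // Producer thread only. Returns false instead of blocking when the buffer is full.
    boolean offer(long value) {
        long writeSeq = tail.get();
        if (writeSeq - head.get() == slots.length) {
            return false;
        }
        slots[(int) (writeSeq & mask)] = value;
        tail.set(writeSeq + 1); // volatile write publishes the slot to the consumer
        return true;
    }

    // Consumer thread only. Returns false when there is nothing to read.
    boolean poll(LongConsumer handler) {
        long readSeq = head.get();
        if (readSeq == tail.get()) {
            return false;
        }
        handler.accept(slots[(int) (readSeq & mask)]);
        head.set(readSeq + 1); // volatile write frees the slot for the producer
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        SpscRingBufferSketch buffer = new SpscRingBufferSketch(8);
        Thread producer = new Thread(() -> {
            for (long i = 1; i <= 1_000; i++) {
                while (!buffer.offer(i)) {
                    // spin until the consumer frees a slot
                }
            }
        });
        producer.start();

        long[] sum = {0};
        int[] received = {0};
        while (received[0] < 1_000) {
            buffer.poll(value -> {
                sum[0] += value;
                received[0]++;
            });
        }
        producer.join();
        System.out.println(sum[0]); // prints 500500
    }
}
```

Neither thread ever waits on a lock, and the hot path allocates nothing, so the garbage collector has no new work to do while messages flow.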

What do we mean by "faster"?

In fact, there is good reason to doubt that C++ is truly "faster" or even has "lower latency" than Java. I realize that I'm getting into some pretty murky waters, and that many developers will begin to question my sanity. But hear me out. Let's imagine this situation: you have two developers, one writing in C++ and the other in Java, and you ask them to write a high-speed trading platform from scratch. The result: the system written in Java will take longer to complete trade transactions than the system written in C++. However, Java has far fewer instances of undefined behavior than C++. To take just one example, indexing outside of an array is a bug in both Java and C++. If you accidentally do this in C++, you might get a segfault or (more often) you'll just end up with some random number. In Java, going out of bounds always throws an ArrayIndexOutOfBoundsException. This means that debugging in Java is much easier because errors are usually identified immediately and the location of the error is easier to track down.

In addition, at least in my experience, Java is better at recognizing which pieces of code don't need to run and which are critical to the operation of your software. You can, of course, spend days tweaking your C++ code so that it contains absolutely no extraneous code, but in the real world, every piece of software contains some bloat, and Java is better at recognizing it automatically. This means that in the real world, Java is often faster than C++, even by standard latency metrics. And even where this is not the case, the difference in latency between the languages is often dwarfed by other factors, or is too small to matter even in high-speed trading.
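The out-of-bounds behavior is easy to see for yourself. In this minimal sketch (the array contents and names are arbitrary), reading one element past the end never yields a stray value; the JVM throws at the exact line of the faulty access.

```java
final class BoundsCheckSketch {

    static String readPastEnd() {
        int[] prices = {101, 102, 103};
        try {
            // Valid indexes are 0..2, so index 3 is out of bounds.
            return "read " + prices[prices.length];
        } catch (ArrayIndexOutOfBoundsException e) {
            // The failure is reported at the faulty access, not somewhere downstream.
            return "caught " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(readPastEnd()); // prints caught ArrayIndexOutOfBoundsException
    }
}
```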

Benefits of Java for Low Latency Systems

All of these factors, in my opinion, make a pretty compelling argument for using Java to write high-speed trading platforms (and low-latency systems in general, more on that in a moment). However, to sway C++ enthusiasts a little, let's look at some additional reasons to use Java:

First, any excess latency that Java introduces into your software will likely be much less than other factors that affect latency—such as internet problems. This means that any (well-written) Java code can easily perform as well as C++ in most trading situations.
Java's shorter development time also means that, in the real world, software written in Java can adapt to hardware changes (or even new trading strategies) more quickly than C++.
If you dig deeper into this, you'll see that even optimizing Java software can be faster (when considered across the entire software) than a similar task in C++.

In other words, you can very well write Java code to reduce latency. You just need to write it the way you would write C++, keeping memory management in mind at every stage of development. The advantage of not actually writing in C++ is that debugging, agile development, and adapting to multiple environments are easier and faster in Java.
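One common way to keep memory management in mind in Java is to allocate everything up front and keep the hot path free of `new` and of boxing. The sketch below is only an illustration of that habit, with an invented class that tracks a moving average of recent prices in a preallocated primitive array.

```java
final class PreallocatedWindowSketch {
    private final long[] pricesCents; // allocated once; primitives, so no boxing
    private int next;                 // index of the slot to overwrite next
    private int count;                // number of slots filled so far
    private long sum;                 // running sum of the values in the window

    PreallocatedWindowSketch(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        pricesCents = new long[size];
    }

    // Hot path: no allocation at all, so it creates no garbage to collect.
    void record(long priceCents) {
        if (count == pricesCents.length) {
            sum -= pricesCents[next]; // drop the oldest value from the running sum
        } else {
            count++;
        }
        pricesCents[next] = priceCents;
        sum += priceCents;
        next = (next + 1) % pricesCents.length;
    }

    long averageCents() {
        return count == 0 ? 0 : sum / count;
    }

    public static void main(String[] args) {
        PreallocatedWindowSketch window = new PreallocatedWindowSketch(3);
        window.record(100);
        window.record(200);
        window.record(300);
        window.record(400); // overwrites the oldest value, 100
        System.out.println(window.averageCents()); // prints 300
    }
}
```

A version built on a `List<Long>` would box every price and leave short-lived objects behind on each update, which is exactly the kind of garbage that leads to collector pauses.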

Conclusions

If you're not developing low latency trading systems, you're probably wondering whether any of the above applies to you. The answer, with very few exceptions, is yes. The debate about how to achieve low latency is neither new nor unique to the world of finance. For this reason, valuable lessons can be learned from it for other situations. In particular, the above argument that Java is "better" because it is more flexible, more fault-tolerant, and ultimately faster to develop and maintain can be applied to many areas of software development. The reasons I (personally) prefer to write low latency systems in Java are the same reasons that have made the language so successful over the past 25 years. Java is easy to write, compile, debug, and learn. This means you can spend less time writing code and more time optimizing it. In practice, this leads to more reliable and faster trading systems. And that's all that matters for high-speed trading.
