JavaRush /Java Blog /Random EN /JDBC or where it all starts

Level 3

8 August 2023
207 views
0 comments

JDBC or where it all starts

In the modern world without data storage in any way. And the history of working with databases began a long time ago, with the advent of JDBC. I propose to recall something that no modern framework built on top of JDBC can do without. In addition, even working with them at times you may need the opportunity to "return to the roots". I hope the review helps as an introductory word or helps to refresh something in memory.

Introduction

One of the main goals of a programming language is the storage and processing of information. To better understand the workings of data storage, it is worth spending a little time on the theory and architecture of applications. For example, you can read the literature, namely the book " Software Architect's Handbook: Become a successful software architect by implementing effective arch... " by Joseph Ingeno. As said, there is a certain Data Tier or "Data Layer". It includes a place to store data (for example, a SQL database) and tools for working with a data warehouse (for example, JDBC, which will be discussed). Also on the Microsoft website there is an article: " Designing the infrastructure persistence level", which describes the architectural solution for separating an additional layer from the Data Tier - the Persistence Layer. In this case, the Data Tier is the level of storing the data itself, while the Persistence Layer is some level of abstraction for working with data from the storage from the Data Tier level. The "DAO" template or various ORMs can be attributed to the Persistence Layer level. But ORM is a topic for a separate discussion. As you can already understand, Data Tier appeared first. Ever since JDK 1.1, JDBC (Java DataBase Connectivity - connection with databases in Java).This is a standard for the interaction of Java applications with various DBMS, implemented in the form of java.sql and javax.sql packages that are part of Java SE:

This standard is described by the " JSR 221 JDBC 4.1 API " specification. This specification tells us that the JDBC API provides programmatic access to relational databases from programs written in Java. It also talks about the fact that the JDBC API is part of the Java platform and is therefore included in Java SE and Java EE. The JDBC API is represented by two packages: java.sql and javax.sql. Let's get to know them then.

Beginning of work

To understand what the JDBC API is in general, we need a Java application. It is most convenient to use one of the project build systems. For example, let's use Gradle . You can read more about Gradle in a short overview: " A brief introduction to Gradle ". First, let's initialize a new Gradle project. Since the Gradle functionality is implemented through plugins, we need to use the " Gradle Build Init Plugin " for initialization:

gradle init --type java-application

After that, we will open the build script - the build.gradle file , which describes our project and how to work with it. We are interested in the " dependencies " block, which describes dependencies - that is, those libraries / frameworks / api, without which we cannot work and on which we depend. By default we will see something like:

dependencies {
    // This dependency is found on compile classpath of this component and consumers.
    implementation 'com.google.guava:guava:26.0-jre'
    // Use JUnit test framework
    testImplementation 'junit:junit:4.12'
}

Why are we seeing this here? These are the dependencies of our project that Gradle automatically generated for us when the project was created. And also because guava is a separate library that is not bundled with Java SE. JUnit is also not bundled with Java SE. But we have JDBC out of the box, that is, it is part of Java SE. It turns out we have JDBC. Great. What else do we need? There is such a wonderful scheme:

As we can see, and it is logical, the database is an external component, which is not natively in Java SE. This is explained simply - there are a huge number of databases and you can work with any. For example, there are PostgreSQL, Oracle, MySQL, H2. Each of these databases is supplied by a separate company called database vendors or database vendors. Each database is written in some programming language (not necessarily Java). In order to be able to work with the database from a Java application, the database provider writes a special driver, which is its own image adapter. Such JDBC compliant (that is, those that have a JDBC driver) are also called "JDBC-Compliant Database". Here you can draw an analogy with computer devices. For example, notepad has a "Print" button. Every time you press it, the program tells the operating system that the notepad application wants to print. And you have a printer. To teach your operating system to speak in a uniform way with a Canon or HP printer, you will need different drivers. But for you, as a user, nothing will change. You will still press the same button. Same with JDBC. You are executing the same code, it's just that different databases can work "under the hood". I think this is a very clear approach. Each such JDBC driver is some kind of artifact, library, jar file. It is also a dependency for our project. For example, we can choose the database " To teach your operating system to speak in a uniform way with a Canon or HP printer, you will need different drivers. But for you, as a user, nothing will change. You will still press the same button. Same with JDBC. You are executing the same code, it's just that different databases can work "under the hood". I think this is a very clear approach. Each such JDBC driver is some kind of artifact, library, jar file. It is also a dependency for our project. For example, we can choose the database " To teach your operating system to speak in a uniform way with a Canon or HP printer, you will need different drivers. But for you, as a user, nothing will change. You will still press the same button. Same with JDBC. You are executing the same code, it's just that different databases can work "under the hood". I think this is a very clear approach. Each such JDBC driver is some kind of artifact, library, jar file. It is also a dependency for our project. For example, we can choose the database " Each such JDBC driver is some kind of artifact, library, jar file. It is also a dependency for our project. For example, we can choose the database " Each such JDBC driver is some kind of artifact, library, jar file. It is also a dependency for our project. For example, we can choose the database "H2 Database " and then we need to add a dependency like this:

dependencies {
    implementation 'com.h2database:h2:1.4.197'

How to find the dependency and how to describe it is indicated on the official websites of the database vendor or on " Maven Central ". The JDBC driver is not a database, as you understand. But it is only a guide to it. But there is such a thing as " In memory databases ". These are databases that exist in memory for the lifetime of your application. Usually, this is often used for testing or for training purposes. This allows you not to install a separate database server on the machine. Which is very suitable for us to get acquainted with JDBC. So our sandbox is ready and we start.

connection

So, we have a JDBC driver, we have a JDBC API. As we remember, JDBC stands for Java DataBase Connectivity. Therefore, it all starts with Connectivity - the ability to establish a connection. And a connection is a Connection. Let's turn again to the text of the JDBC specification and look at the table of contents. In the chapter " CHAPTER 4 Overview " (overview - overview) refer to the section " 4.1 Establishing a Connection " (connection establishment) it is said that there are two ways to connect to the database:

Via DriverManager
Via DataSource

Let's deal with DriverManager. As mentioned, the DriverManager allows you to connect to the database at the specified URL, and also loads the JDBC Drivers that it found in the CLASSPATH (and before JDBC 4.0, you had to load the driver class yourself). About connection with a DB there is a separate chapter "CHAPTER 9 Connections". We are interested in how to get a connection through the DriverManager, so we are interested in section "9.3 The DriverManager Class". It specifies how we can access the database:

Connection con = DriverManager.getConnection(url, user, passwd);

The parameters can be taken from the site of the database we have chosen. In our case, this is H2 - " H2 Cheat Sheet ". Let's move on to the AppTest class prepared by Gradle. It contains JUnit tests. A JUnit test is a method that is annotated with @Test. Unit tests are not the topic of this review, so we will simply limit ourselves to understanding that these are methods described in a certain way, the purpose of which is to test something. According to the JDBC specification and the H2 site, let's check that we got a connection to the database. Let's write a method for getting a connection:

private Connection getNewConnection() throws SQLException {
	String url = "jdbc:h2:mem:test";
	String user = "sa";
	String passwd = "sa";
	return DriverManager.getConnection(url, user, passwd);
}

Now let's write a test for this method, which will check that the connection is actually established:

@Test
public void shouldGetJdbcConnection() throws SQLException {
	try(Connection connection = getNewConnection()) {
		assertTrue(connection.isValid(1));
		assertFalse(connection.isClosed());
	}
}

This test, when executed, will check that the received connection is valid (correctly created) and that it is not closed. By using the try-with-resources construct , we will release the resources after we no longer need them. This will save us from "sagging" connections and memory leaks. Since any actions with the database require a connection, let's provide for the rest of the test methods marked with @Test at the beginning of the Connection test, which we will release after the test. To do this, we need two annotations: @Before and @After Add a new field to the AppTest class that will store the JDBC connection for tests:

private static Connection connection;

And add new methods:

@Before
public void init() throws SQLException {
	connection = getNewConnection();
}
@After
public void close() throws SQLException {
	connection.close();
}

Now, any test method is guaranteed to have a JDBC connection and doesn't have to create it every time.

Statements

Next, we are interested in Statements or expressions. They are described in the documentation in chapter " CHAPTER 13 Statements ". First, it says that there are several types or kinds of statements:

Statement: SQL statement that contains no parameters
PreparedStatement : Prepared SQL statement containing input parameters
CallableStatement : SQL statement with the ability to get a return value from stored procedures (SQL Stored Procedures).

So, having a connection, we can execute some request within this connection. Therefore, it is logical that we initially get an instance of the SQL expression from Connection. You need to start by creating a table. Let's describe the query for creating a table as a variable of the String type. How to do it? Let's use some tutorial like " sqltutorial.org ", " sqlbolt.com ", " postgresqltutorial.com ", " codecademy.com ". Let's use, for example, an example from the SQL course on khanacademy.org . Let's add a method for executing an expression in the database:

private int executeUpdate(String query) throws SQLException {
	Statement statement = connection.createStatement();
	// Для Insert, Update, Delete
	int result = statement.executeUpdate(query);
	return result;
}

Let's add a method for creating a test table using the previous method:

private void createCustomerTable() throws SQLException {
	String customerTableQuery = "CREATE TABLE customers " +
                "(id INTEGER PRIMARY KEY, name TEXT, age INTEGER)";
	String customerEntryQuery = "INSERT INTO customers " +
                "VALUES (73, 'Brian', 33)";
	executeUpdate(customerTableQuery);
	executeUpdate(customerEntryQuery);
}

Now let's test this:

@Test
public void shouldCreateCustomerTable() throws SQLException {
	createCustomerTable();
	connection.createStatement().execute("SELECT * FROM customers");
}

Now let's execute the request, and even with a parameter:

@Test
public void shouldSelectData() throws SQLException {
 	createCustomerTable();
 	String query = "SELECT * FROM customers WHERE name = ?";
	PreparedStatement statement = connection.prepareStatement(query);
	statement.setString(1, "Brian");
	boolean hasResult = statement.execute();
	assertTrue(hasResult);
}

JDBC does not support named parameters for PreparedStatement, so the parameters themselves are specified by the questions, and by specifying the value we specify the index of the question (starting from 1, not from zero). In the last test, we got true as an indicator of whether there is a result. But how is the query result represented in the JDBC API? And it is presented as ResultSet.

result set

The concept of ResultSet is described in the JDBC API specification in the chapter "CHAPTER 15 Result Sets". First of all, it says that ResultSet provides methods for getting and manipulating the results of executed queries. That is, if the execute method returned true to us, then we can get the ResultSet. Let's move the call to the createCustomerTable() method into the init method, which is marked as @Before. Now let's refine our shouldSelectData test:

@Test
public void shouldSelectData() throws SQLException {
	String query = "SELECT * FROM customers WHERE name = ?";
	PreparedStatement statement = connection.prepareStatement(query);
	statement.setString(1, "Brian");
	boolean hasResult = statement.execute();
	assertTrue(hasResult);
	// Обработаем результат
	ResultSet resultSet = statement.getResultSet();
	resultSet.next();
	int age = resultSet.getInt("age");
	assertEquals(33, age);
}

Here it is worth noting that next is a method that moves the so-called "cursor". The cursor in the ResultSet points to some row. Thus, to read a line, you need to set this same cursor on it. When the cursor is moved, the cursor movement method returns true if the cursor is valid (correct, correct), i.e. points to data. If it returns false, then there is no data, that is, the cursor does not point to data. If we try to get data with an invalid cursor, we will get an error: No data is available It is also interesting that you can update or even insert rows through the ResultSet:

@Test
public void shouldInsertInResultSet() throws SQLException {
	Statement statement = connection.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_UPDATABLE);
	ResultSet resultSet = statement.executeQuery("SELECT * FROM customers");
	resultSet.moveToInsertRow();
	resultSet.updateLong("id", 3L);
	resultSet.updateString("name", "John");
	resultSet.updateInt("age", 18);
	resultSet.insertRow();
	resultSet.moveToCurrentRow();
}

RowSet

JDBC, in addition to ResultSet, introduces such a concept as RowSet. More details can be read here: " JDBC Basics: Using RowSet Objects ". There are various usage variations. For example, the simplest case might look like this:

@Test
public void shouldUseRowSet() throws SQLException {
 	JdbcRowSet jdbcRs = new JdbcRowSetImpl(connection);
 	jdbcRs.setCommand("SELECT * FROM customers");
	jdbcRs.execute();
	jdbcRs.next();
	String name = jdbcRs.getString("name");
	assertEquals("Brian", name);
}

As you can see, RowSet is similar to statement symbiosis (we specified command through it) and executed command. Through it, we control the cursor (by calling the next method) and get data from it. Not only this approach is interesting, but also possible implementations. For example CachedRowSet. It is "disconnected" (that is, it does not use a persistent connection to the database) and requires explicit synchronization with the database:

CachedRowSet jdbcRsCached = new CachedRowSetImpl();
jdbcRsCached.acceptChanges(connection);

For more information, see the tutorial on the Oracle site: " Using CachedRowSetObjects ".

metadata

In addition to queries, a database connection (that is, an instance of the Connection class) provides access to metadata - data about how our database is configured and how it works. But first, let's talk about a few key points: the URL for connecting to our database: "jdbc:h2:mem:test". test is the name of our database. For the JDBC API, this is a directory. And the name will be in uppercase, that is, TEST. Default schema) for H2 - PUBLIC. Now, let's write a test that shows all user tables. Why custom? Because in databases there are not only user tables (those that we ourselves created using create table statements), but also system tables. They are necessary to store system information about the structure of the database. Each database can store such system tables in different ways. For example, in H2 they are stored in the " INFORMATION_SCHEMA " schema. Interestingly, INFORMATION SCHEMA is a common approach, but Oracle went the other way. You can read more here: " INFORMATION_SCHEMA and Oracle ". Let's write a test that receives metadata on user tables:

@Test
public void shoudGetMetadata() throws SQLException {
	// У нас URL = "jdbc:h2:mem:test", где test - название БД
	// Название БД = catalog
	DatabaseMetaData metaData = connection.getMetaData();
	ResultSet result = metaData.getTables("TEST", "PUBLIC", "%", null);
	List<String> tables = new ArrayList<>();
	while(result.next()) {
		tables.add(result.getString(2) + "." + result.getString(3));
	}
	assertTrue(tables.contains("PUBLIC.CUSTOMERS"));
}

Connection pool

Connection pooling is covered in the "Chapter 11 Connection Pooling" section of the JDBC specification. It also provides the main justification for the need for a connection pool. Each Coonection is a physical connection to the database. Its creation and closing is a rather "expensive" work. JDBC only provides a connection pooling API. Therefore, the choice of implementation remains with us. For example, such implementations include HikariCP . Accordingly, we will need to add a pool to our project dependencies:

dependencies {
    implementation 'com.h2database:h2:1.4.197'
    implementation 'com.zaxxer:HikariCP:3.3.1'
    testImplementation 'junit:junit:4.12'
}

Now we need to somehow use this pool. To do this, you need to initialize the data source, aka Datasource:

private DataSource getDatasource() {
	HikariConfig config = new HikariConfig();
	config.setUsername("sa");
	config.setPassword("sa");
	config.setJdbcUrl("jdbc:h2:mem:test");
	DataSource ds = new HikariDataSource(config);
	return ds;
}

And write a test to get a connection from the pool:

@Test
public void shouldGetConnectionFromDataSource() throws SQLException {
	DataSource datasource = getDatasource();
	try (Connection con = datasource.getConnection()) {
		assertTrue(con.isValid(1));
	}
}

Transactions

One of the most interesting things about JDBC is transactions. In the JDBC specification, they are given a chapter called "CHAPTER 10 Transactions". First of all, it is worth understanding what a transaction is. A transaction is a group of logically combined sequential data operations, processed or canceled as a whole. When does a transaction start when using JDBC? As the specification says, this is decided directly by the JDBC Driver. But usually, a new transaction starts when the current SQL statement (SQL statement) requires a transaction and the transaction has not yet been created. When does a transaction end? This is controlled by the auto-commit attribute. If autocommit is enabled, then the transaction will be committed after the SQL statement is "completed". What is "done" depends on the type of SQL statement:

Data Manipulation Language, aka DML (Insert, Update, Delete)
The transaction is completed as soon as the action has completed

ResultSet#close

CallableStatement and Multiple Result Expressions
When all associated ResultSets are closed and all output is received (including number of updates)

This is exactly how the JDBC API behaves. As usual, we will write a test for this:

@Test
public void shouldCommitTransaction() throws SQLException {
	connection.setAutoCommit(false);
	String query = "INSERT INTO customers VALUES (1, 'Max', 20)";
	connection.createStatement().executeUpdate(query);
	connection.commit();
	Statement statement = connection.createStatement();
 	statement.execute("SELECT * FROM customers");
	ResultSet resultSet = statement.getResultSet();
	int count = 0;
	while(resultSet.next()) {
		count++;
	}
	assertEquals(2, count);
}

Everything is simple. But this is so, while we have only one transaction. What to do when there are several? They need to be isolated from each other. Therefore, let's talk about transaction isolation levels and how JDBC handles them.

Isolation levels

Let's open subsection "10.2 Transaction Isolation Levels" of the JDBC specification. Here, before moving on, I want to doubt about such a thing as ACID. ACID describes the requirements for a transactional system.

Atomicity:
No transaction will be partially committed to the system. Either all of its sub-operations will be executed, or none of them will be executed.
Consistency:
Each successful transaction, by definition, commits only valid results.
Isolation:
While a transaction is running, concurrent transactions should not affect its outcome.
Durability:
If the transaction is successfully completed, the changes made to it will not be undone due to some kind of failure.

Speaking about transaction isolation levels, we are talking about the "Isolation" requirement. Isolation is an "expensive" requirement, so in real databases there are modes that do not completely isolate a transaction (isolation levels of Repeatable Read and below). Wikipedia has an excellent explanation of what problems can arise when working with transactions. Read more here: " Concurrency Issues Using Transactions ". Before we write our test, let's slightly change our Gradle Build Script: add a block with properties, that is, with the settings of our project:

ext {
    h2Version = '1.3.176' // 1.4.177
    hikariVersion = '3.3.1'
    junitVersion = '4.12'
}

Next, use this in versions:

dependencies {
    implementation "com.h2database:h2:${h2Version}"
    implementation "com.zaxxer:HikariCP:${hikariVersion}"
    testImplementation "junit:junit:${junitVersion}"
}

You may have noticed that the h2 version has become lower. Later we will see why. So how do you apply isolation levels? Let's see right away a small practical example:

@Test
public void shouldGetReadUncommited() throws SQLException {
	Connection first = getNewConnection();
	assertTrue(first.getMetaData().supportsTransactionIsolationLevel(Connection.TRANSACTION_READ_UNCOMMITTED));
	first.setTransactionIsolation(Connection.TRANSACTION_READ_UNCOMMITTED);
	first.setAutoCommit(false);
	// Транзакиця на подключение. Поэтому первая транзакция с ReadUncommited вносит изменения
	String insertQuery = "INSERT INTO customers VALUES (5, 'Max', 15)";
	first.createStatement().executeUpdate(insertQuery);
	// Вторая транзакция пытается их увидеть
	int rowCount = 0;
	JdbcRowSet jdbcRs = new JdbcRowSetImpl(getNewConnection());
	jdbcRs.setCommand("SELECT * FROM customers");
	jdbcRs.execute();
	while (jdbcRs.next()) {
		rowCount++;
	}
	assertEquals(2, rowCount);
}

Interestingly, this test may fail on a vendor that does not support TRANSACTION_READ_UNCOMMITTED (eg sqlite or HSQL). And the transaction level may simply not work. Remember we specified the version of the H2 Database driver? If we raise it to h2Version = '1.4.177' and above, then READ UNCOMMITTED will stop working, although we haven't changed the code. This once again proves that the choice of vendor and driver version is not just letters, it will really depend on how your requests will be fulfilled. How to fix this behavior in version 1.4.177 and how it does not work in versions above can be found here: " Support READ UNCOMMITTED isolation level in MVStore mode ".

Outcome

As we can see, JDBC is a powerful tool in the hands of Java for working with databases. I hope this short review will help you to become a starting point or help you refresh something in your memory. Well, for a snack, a few additional materials:

Fire talk: " Transactions: myths, surprises and opportunities " by Martin Kleppmann
Yuri Tkach: " JPA. Transactions "
Yurik Tkach: " JDBC - Java for testers "
Free course on Udemy: " JDBC and MySQL "
" Handling CallableStatement Objects "
IBM Developer: " Java Database Connectivity "
IBM Knowledge Center: " Getting started with JDBC "

#Viacheslav

Comments

TO VIEW ALL COMMENTS OR TO MAKE A COMMENT,
GO TO FULL VERSION