JavaRush /Java Blog /Random EN /Part 2. DBMS structure, tables and data types

Level 41

8 August 2023
188 views
0 comments

Part 2. DBMS structure, tables and data types

First part

We continue to create our simple stock exchange emulator. Here's what we'll do:

Let's create a database organization diagram.
We will describe what, how and where it is stored.
Let's find out how data is related to each other.
Let's start learning the basics of SQL using the example of the SQL table creation command CREATE TABLE , Data Definition Language ( DDL ) of the SQL language.
Let's continue writing the Java program. We implement the main functions of the DBMS in terms of java.sql for creating our database programmatically, using JDBC and a three-tier architecture.

These two parts turned out to be more voluminous, since we need to familiarize ourselves with the basics of SQL and the organization of a DBMS from the inside, and draw analogies with Java. In order not to bore you with code listings, at the end there are links to the corresponding commit github repository with the program.

DBMS design

Application Description

You've already heard that organizing data storage is an integral part of programming. Let me remind you that the purpose of our application is the simplest exchange emulation:

There are shares whose value can change during the trading day according to given rules;
there are traders with initial capital;
traders can buy and sell shares according to their algorithm.

The exchange operates in ticks - fixed periods of time (in our case - 1 minute). During a tick, the stock price may change, and then the trader may buy or sell shares.

Exchange emulation data structure

Let's call individual exchange entities models. To avoid rounding errors, we will work with financial amounts through a class BigDecimal(details can be found in the link at the end of the article). Let's describe the structure of each model in more detail: Promotion:

Attribute	Type	Description
`name`	Srting	Name
`changeProbability`	int	Probability of rate change as a percentage on each tick
`startPrice`	BigDecimal	Initial cost
`delta`	int	The maximum amount in percentage by which the current value can change

Share price:

Attribute	Type	Description
`operDate`	LocalDateTime	Time (tick) for setting the rate
`share`	Promotion	Link to promotion
`rate`	BigDecimal	Share price

Trader:

Attribute	Type	Description
`name`	String	Time (tick) for setting the rate
`sfreqTick`	int	Frequency of transactions. Specified by the period, in ticks, after which the trader performs operations
`cash`	BigDecimal	Amount of money other than shares
`traidingMethod`	int	The algorithm used by the trader. Let's set it as a constant number, the implementation of the algorithm will be (in the following parts) in Java code
`changeProbability`	int	Probability of completing the operation, percentage
`about`	String	Probability of rate change, in percentage, on each tick

Trader actions:

Attribute	Type	Description
`operation`	int	Transaction type (buy or sell)
`traider`	Trader	Trader link
`shareRate`	Share price	Link to the stock price (respectively, the stock itself, its rate and the time it was issued)
`amount`	Long	Number of shares involved in the transaction

To ensure the uniqueness of each model, we will add an attribute idof type long . This attribute will be unique within model instances and will uniquely identify it. Attributes that reference other models (trader, stock, stock price) can use this one idto uniquely identify the corresponding model. The thought immediately comes to mind that we could use Map<Long, Object> to store such data, where Objectis the corresponding model. However, try implementing this in code under the following conditions:

the data size significantly exceeds the amount of available RAM;
access to data is expected from a dozen different places;
the ability to simultaneously modify and read data is required;
it is necessary to ensure rules for the formation and integrity of data;

...and you will be faced with tasks that require proper qualifications and time to implement. There is no need to “reinvent the wheel”. Much has already been thought out and written for us. So we will use what has already been tested over the years.

Storing Data in Java

Let's consider the action. In Java, we created a specific class for this model Sharewith fields name, changeProbability, startPrice, delta. And many shares were stored as Map<Long, Share>, where the key is a unique identifier for each share.


public class Share {
    private String name;
    private BigDecimal startPrice;
    private int changeProbability; 
    private int delta;   
}
Map<Long, Share> shares = new HashMap<>();
shares.put(1L, new Share("ibm", BigDecimal.valueOf(20.0), 15, 10));
shares.put(2L, new Share("apple", BigDecimal.valueOf(14.0), 25, 15));
shares.put(3L, new Share("google", BigDecimal.valueOf(12.0), 20, 8));
...
shares.put(50L, new Share("microsoft", BigDecimal.valueOf(17.5), 10,4 ));

To access the desired promotion by ID, use the method shares.get(id). For the task of finding a stock by name or price, we would loop through all the records looking for the one we need, and so on. But we will go the other way and store the values in the DBMS.

Data storage in a DBMS

Let us formulate an initial set of data storage rules for a DBMS:

Data in a DBMS is organized into tables ( TABLE ), which are a set of records.
All records have the same sets of fields. They are set when creating the table.
The field can be set to a default value ( DEFAULT ).
For a table, you can set constraints ( CONSTRAINT ) that describe the requirements for its data to ensure their integrity. This can be done at the table creation stage ( CREATE TABLE ) or added later ( ALTER TABLE ... ADD CONSTRAINT ).
The most common CONSTRAINT :
- The primary key is PRIMARY (Id in our case).
- Unique value field UNIQUE (VIN for the vehicle table).
- Checking the CHECK field (percentage value cannot be greater than 100). One of the private restrictions on a field is NOT NULL or NULL , which prohibits/allows storing NULL in a table field.
- Link to a third-party table FOREIGN KEY (link to a stock in the stock price table).
- Index INDEX (indexing a field to speed up the search for values in it).
- Modification of a record ( INSERT , UPDATE ) will not occur if the values of its fields contradict the restrictions (CONSTRAINT).
Each table can have a key field (or several) that can be used to uniquely identify a record. Such a field (or fields, if they form a composite key) forms the primary key of the table - PRIMARY KEY .
- The primary key ensures the uniqueness of a record in the table; an index is created on it, which gives quick access to the entire record based on the key value.
- Having a primary key makes it much easier to create links between tables. Next, we will use an artificial primary key: for the first record id = 1, each subsequent record will be inserted into the table with the id value increased by one. This key is often called AutoIncrement or AutoIdentity .

Actually, a table of stocks: Part 2. DBMS structure, tables and data types - 2

Is it possible to use the stock name as a key in this case? By and large - yes, but there is a possibility that some company issues different shares and calls them only by its own name. In this case, there will no longer be uniqueness. In practice, an artificial primary key is used quite often. Agree, using a full name as a unique key in a table containing records of people will not ensure uniqueness. As well as using a combination of full name and date of birth.

Data types in DBMS

Like any other programming language, SQL has data typing. Here are the most common SQL data types: Integer types

SQL type	SQL synonyms	Matching in Java	Description
INT	INT4,INTEGER	java.lang.Integer	4-byte integer, -2147483648 … 2147483647
BOOLEAN	BOOL, BIT	java.lang.Boolean	True, False
TINYINT		java.lang.Byte	1-byte integer, -128 … 127
SMALLINT	INT2	java.lang.Short	2-byte integer, -32768 … 32767
BIGINT	INT8	java.lang.Long	8-byte integer, -9223372036854775808 … 9223372036854775807
AUTO_INCREMENT	INCREMENT	java.lang.Long	An incremental counter unique to the table. If a new value is inserted into it, it is increased by one. The generated values are never repeated.

Real

SQL type	SQL synonyms	Matching in Java	Description
DECIMAL(N,M)	DEC, NUMBER	java.math.BigDecimal	Fixed precision decimal (N integer digits and M fractional digits). Mainly designed for working with financial data.
DOUBLE	FLOAT8	java.lang.Double	Double precision real number (8 bytes).
REAL	FLOAT4	java.lang.Real	Single precision real number (4 bytes).

String

SQL type	SQL synonyms	Matching in Java	Description
VARCHAR(N)	NVARCHAR	java.lang.String	UNICODE string of length N. Length limited to 2147483647 Loads the entire contents of the string into memory.

date and time

SQL type	SQL synonyms	Matching in Java	Description
TIME		java.time.LocalTime, java.sql.Time	Storing time (up to nanoseconds), when converting to DATETIME, the date is set to January 1, 1970.
DATE		java.time.LocalDate, java.sql.Timestamp	Storing dates in yyyy-mm-dd format, time is set as 00:00
DATETIME	TIMESTAMP	java.time.LocalDateTime, java.sql.Timestamp	Storing date + time (without taking into account time zones).

Storage of large volumes of data

SQL type	Matching in Java	Description
BLOB	java.io.InputStream, java.sql.Blob	Storing binary data (pictures, files...).
CLOB	java.io.Reader, java.sql.Clob	Storing large text data (books, articles...), unlike VARCHAR, loads data into memory in portions.

SQL writing style

For many languages, there are code formatting guidelines. Typically, such documents contain rules for naming variables, constants, methods and other language structures. So, for Python there is PEP8, for Java - Oracle Code Conventions for Java . Several different sets have been created for SQL, which are slightly different from each other. Regardless, you should develop the habit of following rules when formatting your code, especially if you work in a team. The rules could be, for example, the following (of course, you can develop a different set of rules for yourself, the main thing is to stick to them in the future):

Keywords and reserved words, including commands and operators, must be written in capital letters: CREATE TABLE, CONSTRAINT...
The names of tables, fields and other objects should not coincide with SQL language keywords (see the link at the end of the article), but may contain them.
Table names should reflect their purpose. They are written in lowercase letters. Words in the name are separated from each other by underscores. The word at the end must be in the plural : traders (traders), share_rates (share rate).
Table field names should reflect their purpose. They must be written in lowercase letters, the words in the name must be formatted in Camel Case style , and the word at the end must be used in the singular : name (name), share_rates (share rate).
Artificial key fields must contain the word id.
CONSTRAINT names must follow table naming conventions. They must also include the fields and tables involved in them, begin with a semantic prefix: check_ (checking the field value), pk_ (primary key), fk_ (foreign key), uniq_ (field uniqueness), idx_ (index). Example: pk_traider_share_actions_id (primary key on the id field for the trader_share_actions table).
And so on, as you study SQL, the list of rules will be replenished/changed.

DBMS design

Immediately before creating a DBMS, it needs to be designed. The final schema contains tables, a set of fields, CONSTRAINT, keys, default conditions for fields, relationships between tables and other database entities. On the Internet you can find many free online/offline designers for designing small DBMSs. Try typing something like “Database designer free” into a search engine. Such applications have useful additional properties:

Can generate SQL commands to create a DBMS.
Visually display the settings on the diagram.
Allows you to move tables for better visualization.
Show keys, indexes, relationships, default values, and the like on the diagram.
They can remotely store the DBMS schema.

For example, dbdiffo.com highlights keys, shows non-empty fields and AI (AutoIncrement) counters with the NN label:

Part 2. DBMS structure, tables and data types - 3

Creating tables in a DBMS

So we have a diagram. Now let's move on to creating tables (CREATE TABLE). To do this, it is advisable for us to have preliminary data:

table name
field names and type
restrictions (CONSTRAINTS) on fields
default values for fields (if available)
primary key (PRIMARY KEY) if available
connections between tables (FOREIGN KEY)

We will not study in detail all the options of the CREATE TABLE command; we will look at the basics of SQL using the example of creating a table for traders:


CREATE TABLE traiders( 
	id BIGINT AUTO_INCREMENT PRIMARY KEY,
	name VARCHAR(255) NOT NULL,
	freqTiсk INTEGER NOT NULL,
	cash  DECIMAL(15,2) NOT NULL DEFAULT 1000,
	tradingMethod INTEGER NOT NULL,
	changeProbability INTEGER NOT NULL DEFAULT 50,
	about VARCHAR(255) NULL
);
ALTER TABLE traiders ADD CONSTRAINT check_traiders_tradingMethod 
	CHECK(tradingMethod IN (1,2,3));
ALTER TABLE traiders ADD CONSTRAINT check_traiders_changeProbability 
	CHECK(changeProbability <= 100 AND changeProbability > 0)

Let's take a closer look:

CREATE TABLE traiders(field description) - creates a table with the specified name; in the description, the fields are separated by a comma. Any command ends with a semicolon.
The field description begins with its name, followed by its type, CONSTRAINT, and default value.
id BIGINT AUTO_INCREMENT PRIMARY KEY– the id field of an integer type is a primary key and an incremental counter (for each new record for the id field, a value will be generated that is one more than the previously created one for this table).
cash DECIMAL(15,2) NOT NULL DEFAULT 1000– cash field, decimal, 15 digits before the decimal point and two after (financial data, for example, dollars and cents). Cannot accept NULL values. If no value is given, it will get the value 1000.
about VARCHAR(255) NULL– the about field, a string up to 255 characters long, can accept empty values.

Note that we can set part of the CONSTRAINT conditions after creating the table. Let's consider the construction for modifying the table structure and its fields: ALTER TABLE table_name ADD CONSTRAINT constraint_name CHECK (condition) using examples:

CHECK(tradingMethod IN (1,2,3))– the tradingMethod field can only take values 1,2,3
CHECK(changeProbability <= 100 AND changeProbability > 0)– the changeProbability field can take integer values in the range from 1 to 100

Relationships between tables

To analyze the description of relationships between tables, let's look at the creation of share_rates:


CREATE TABLE share_rates(
	id BIGINT AUTO_INCREMENT PRIMARY KEY,
	operDate datetime NOT NULL,
	share BIGINT NOT NULL,
	rate DECIMAL(15,2) NOT NULL
);
ALTER TABLE share_rates ADD FOREIGN KEY (share) REFERENCES shares(id)

Part 2. DBMS structure, tables and data types - 4

A reference to the values of another table can be set as follows: ALTER TABLEtable_from_which_referred ADD FOREIGN KEY(field_that_referred) REFERENCEStable_to_which_referred (field_that_referred to) Let in shares we have records on shares, for example, for id=50 we store Microsoft shares with an initial price of 17.5, a delta of 20 and a chance of change of 4%. For the share_rates table we get three main properties:

We only need to store the value of the id key from the shares table in the share field in order to use it to obtain the remaining information (name, etc.) from the shares table.
We cannot create a rate for a non-existent promotion. You cannot insert a non-existent value into the share field (for which there is no record in the shares table with this id), since there will be no correspondence between the tables.
We cannot delete a share entry in shares for which rates are set in share_rates.

The last two points serve to ensure the integrity of the stored data. You can see the creation of SQL tables of our emulation and examples of SQL queries in the Java implementation of methods of the corresponding classes using the link to the github repository at the end of the article. The third part

Comments

TO VIEW ALL COMMENTS OR TO MAKE A COMMENT,
GO TO FULL VERSION