JavaRush /Java Blog /Random EN /Part 2. DBMS structure, tables and data types
Marat Sadykov
Level 41

Part 2. DBMS structure, tables and data types

Published in the Random EN group
First part
Part 2. DBMS structure, tables and data types - 1
We continue to create our simple stock exchange emulator. Here's what we'll do:
  • Let's create a database organization diagram.
  • We will describe what, how and where it is stored.
  • Let's find out how data is related to each other.
  • Let's start learning the basics of SQL using the example of the SQL table creation command CREATE TABLE , Data Definition Language ( DDL ) of the SQL language.
  • Let's continue writing the Java program. We implement the main functions of the DBMS in terms of java.sql for creating our database programmatically, using JDBC and a three-tier architecture.
These two parts turned out to be more voluminous, since we need to familiarize ourselves with the basics of SQL and the organization of a DBMS from the inside, and draw analogies with Java. In order not to bore you with code listings, at the end there are links to the corresponding commit github repository with the program.

DBMS design

Application Description

You've already heard that organizing data storage is an integral part of programming. Let me remind you that the purpose of our application is the simplest exchange emulation:
  • There are shares whose value can change during the trading day according to given rules;
  • there are traders with initial capital;
  • traders can buy and sell shares according to their algorithm.
The exchange operates in ticks - fixed periods of time (in our case - 1 minute). During a tick, the stock price may change, and then the trader may buy or sell shares.

Exchange emulation data structure

Let's call individual exchange entities models. To avoid rounding errors, we will work with financial amounts through a class BigDecimal(details can be found in the link at the end of the article). Let's describe the structure of each model in more detail: Promotion:
Attribute Type Description
name Srting Name
changeProbability int Probability of rate change as a percentage on each tick
startPrice BigDecimal Initial cost
delta int The maximum amount in percentage by which the current value can change
Share price:
Attribute Type Description
operDate LocalDateTime Time (tick) for setting the rate
share Promotion Link to promotion
rate BigDecimal Share price
Trader:
Attribute Type Description
name String Time (tick) for setting the rate
sfreqTick int Frequency of transactions. Specified by the period, in ticks, after which the trader performs operations
cash BigDecimal Amount of money other than shares
traidingMethod int The algorithm used by the trader. Let's set it as a constant number, the implementation of the algorithm will be (in the following parts) in Java code
changeProbability int Probability of completing the operation, percentage
about String Probability of rate change, in percentage, on each tick
Trader actions:
Attribute Type Description
operation int Transaction type (buy or sell)
traider Trader Trader link
shareRate Share price Link to the stock price (respectively, the stock itself, its rate and the time it was issued)
amount Long Number of shares involved in the transaction
To ensure the uniqueness of each model, we will add an attribute idof type long . This attribute will be unique within model instances and will uniquely identify it. Attributes that reference other models (trader, stock, stock price) can use this one idto uniquely identify the corresponding model. The thought immediately comes to mind that we could use Map<Long, Object> to store such data, where Objectis the corresponding model. However, try implementing this in code under the following conditions:
  • the data size significantly exceeds the amount of available RAM;
  • access to data is expected from a dozen different places;
  • the ability to simultaneously modify and read data is required;
  • it is necessary to ensure rules for the formation and integrity of data;
...and you will be faced with tasks that require proper qualifications and time to implement. There is no need to “reinvent the wheel”. Much has already been thought out and written for us. So we will use what has already been tested over the years.

Storing Data in Java

Let's consider the action. In Java, we created a specific class for this model Sharewith fields name, changeProbability, startPrice, delta. And many shares were stored as Map<Long, Share>, where the key is a unique identifier for each share.

public class Share {
    private String name;
    private BigDecimal startPrice;
    private int changeProbability; 
    private int delta;   
}
Map<Long, Share> shares = new HashMap<>();
shares.put(1L, new Share("ibm", BigDecimal.valueOf(20.0), 15, 10));
shares.put(2L, new Share("apple", BigDecimal.valueOf(14.0), 25, 15));
shares.put(3L, new Share("google", BigDecimal.valueOf(12.0), 20, 8));
...
shares.put(50L, new Share("microsoft", BigDecimal.valueOf(17.5), 10,4 ));
To access the desired promotion by ID, use the method shares.get(id). For the task of finding a stock by name or price, we would loop through all the records looking for the one we need, and so on. But we will go the other way and store the values ​​in the DBMS.

Data storage in a DBMS

Let us formulate an initial set of data storage rules for a DBMS:
  • Data in a DBMS is organized into tables ( TABLE ), which are a set of records.
  • All records have the same sets of fields. They are set when creating the table.
  • The field can be set to a default value ( DEFAULT ).
  • For a table, you can set constraints ( CONSTRAINT ) that describe the requirements for its data to ensure their integrity. This can be done at the table creation stage ( CREATE TABLE ) or added later ( ALTER TABLE ... ADD CONSTRAINT ).
  • The most common CONSTRAINT :
    • The primary key is PRIMARY (Id in our case).
    • Unique value field UNIQUE (VIN for the vehicle table).
    • Checking the CHECK field (percentage value cannot be greater than 100). One of the private restrictions on a field is NOT NULL or NULL , which prohibits/allows storing NULL in a table field.
    • Link to a third-party table FOREIGN KEY (link to a stock in the stock price table).
    • Index INDEX (indexing a field to speed up the search for values ​​in it).
    • Modification of a record ( INSERT , UPDATE ) will not occur if the values ​​of its fields contradict the restrictions (CONSTRAINT).
  • Each table can have a key field (or several) that can be used to uniquely identify a record. Such a field (or fields, if they form a composite key) forms the primary key of the table - PRIMARY KEY .
    • The primary key ensures the uniqueness of a record in the table; an index is created on it, which gives quick access to the entire record based on the key value.
    • Having a primary key makes it much easier to create links between tables. Next, we will use an artificial primary key: for the first record id = 1, each subsequent record will be inserted into the table with the id value increased by one. This key is often called AutoIncrement or AutoIdentity .
Actually, a table of stocks: Part 2. DBMS structure, tables and data types - 2 Is it possible to use the stock name as a key in this case? By and large - yes, but there is a possibility that some company issues different shares and calls them only by its own name. In this case, there will no longer be uniqueness. In practice, an artificial primary key is used quite often. Agree, using a full name as a unique key in a table containing records of people will not ensure uniqueness. As well as using a combination of full name and date of birth.

Data types in DBMS

Like any other programming language, SQL has data typing. Here are the most common SQL data types: Integer types
SQL type SQL synonyms Matching in Java Description
INT INT4,INTEGER java.lang.Integer 4-byte integer, -2147483648 … 2147483647
BOOLEAN BOOL, BIT java.lang.Boolean True, False
TINYINT java.lang.Byte 1-byte integer, -128 … 127
SMALLINT INT2 java.lang.Short 2-byte integer, -32768 … 32767
BIGINT INT8 java.lang.Long 8-byte integer, -9223372036854775808 … 9223372036854775807
AUTO_INCREMENT INCREMENT java.lang.Long An incremental counter unique to the table. If a new value is inserted into it, it is increased by one. The generated values ​​are never repeated.
Real
SQL type SQL synonyms Matching in Java Description
DECIMAL(N,M) DEC, NUMBER java.math.BigDecimal Fixed precision decimal (N integer digits and M fractional digits). Mainly designed for working with financial data.
DOUBLE FLOAT8 java.lang.Double Double precision real number (8 bytes).
REAL FLOAT4 java.lang.Real Single precision real number (4 bytes).
String
SQL type SQL synonyms Matching in Java Description
VARCHAR(N) NVARCHAR java.lang.String UNICODE string of length N. Length limited to 2147483647 Loads the entire contents of the string into memory.
date and time
SQL type SQL synonyms Matching in Java Description
TIME java.time.LocalTime, java.sql.Time Storing time (up to nanoseconds), when converting to DATETIME, the date is set to January 1, 1970.
DATE java.time.LocalDate, java.sql.Timestamp Storing dates in yyyy-mm-dd format, time is set as 00:00
DATETIME TIMESTAMP java.time.LocalDateTime, java.sql.Timestamp Storing date + time (without taking into account time zones).
Storage of large volumes of data
SQL type Matching in Java Description
BLOB java.io.InputStream, java.sql.Blob Storing binary data (pictures, files...).
CLOB java.io.Reader, java.sql.Clob Storing large text data (books, articles...), unlike VARCHAR, loads data into memory in portions.

SQL writing style

For many languages, there are code formatting guidelines. Typically, such documents contain rules for naming variables, constants, methods and other language structures. So, for Python there is PEP8, for Java - Oracle Code Conventions for Java . Several different sets have been created for SQL, which are slightly different from each other. Regardless, you should develop the habit of following rules when formatting your code, especially if you work in a team. The rules could be, for example, the following (of course, you can develop a different set of rules for yourself, the main thing is to stick to them in the future):
  • Keywords and reserved words, including commands and operators, must be written in capital letters: CREATE TABLE, CONSTRAINT...
  • The names of tables, fields and other objects should not coincide with SQL language keywords (see the link at the end of the article), but may contain them.
  • Table names should reflect their purpose. They are written in lowercase letters. Words in the name are separated from each other by underscores. The word at the end must be in the plural : traders (traders), share_rates (share rate).
  • Table field names should reflect their purpose. They must be written in lowercase letters, the words in the name must be formatted in Camel Case style , and the word at the end must be used in the singular : name (name), share_rates (share rate).
  • Artificial key fields must contain the word id.
  • CONSTRAINT names must follow table naming conventions. They must also include the fields and tables involved in them, begin with a semantic prefix: check_ (checking the field value), pk_ (primary key), fk_ (foreign key), uniq_ (field uniqueness), idx_ (index). Example: pk_traider_share_actions_id (primary key on the id field for the trader_share_actions table).
  • And so on, as you study SQL, the list of rules will be replenished/changed.

DBMS design

Immediately before creating a DBMS, it needs to be designed. The final schema contains tables, a set of fields, CONSTRAINT, keys, default conditions for fields, relationships between tables and other database entities. On the Internet you can find many free online/offline designers for designing small DBMSs. Try typing something like “Database designer free” into a search engine. Such applications have useful additional properties:
  • Can generate SQL commands to create a DBMS.
  • Visually display the settings on the diagram.
  • Allows you to move tables for better visualization.
  • Show keys, indexes, relationships, default values, and the like on the diagram.
  • They can remotely store the DBMS schema.
For example, dbdiffo.com highlights keys, shows non-empty fields and AI (AutoIncrement) counters with the NN label:
Part 2. DBMS structure, tables and data types - 3

Creating tables in a DBMS

So we have a diagram. Now let's move on to creating tables (CREATE TABLE). To do this, it is advisable for us to have preliminary data:
  • table name
  • field names and type
  • restrictions (CONSTRAINTS) on fields
  • default values ​​for fields (if available)
  • primary key (PRIMARY KEY) if available
  • connections between tables (FOREIGN KEY)
We will not study in detail all the options of the CREATE TABLE command; we will look at the basics of SQL using the example of creating a table for traders:

CREATE TABLE traiders( 
	id BIGINT AUTO_INCREMENT PRIMARY KEY,
	name VARCHAR(255) NOT NULL,
	freqTiсk INTEGER NOT NULL,
	cash  DECIMAL(15,2) NOT NULL DEFAULT 1000,
	tradingMethod INTEGER NOT NULL,
	changeProbability INTEGER NOT NULL DEFAULT 50,
	about VARCHAR(255) NULL
);
ALTER TABLE traiders ADD CONSTRAINT check_traiders_tradingMethod 
	CHECK(tradingMethod IN (1,2,3));
ALTER TABLE traiders ADD CONSTRAINT check_traiders_changeProbability 
	CHECK(changeProbability <= 100 AND changeProbability > 0)
Let's take a closer look:
  • CREATE TABLE traiders(field description) - creates a table with the specified name; in the description, the fields are separated by a comma. Any command ends with a semicolon.
  • The field description begins with its name, followed by its type, CONSTRAINT, and default value.
  • id BIGINT AUTO_INCREMENT PRIMARY KEY– the id field of an integer type is a primary key and an incremental counter (for each new record for the id field, a value will be generated that is one more than the previously created one for this table).
  • cash DECIMAL(15,2) NOT NULL DEFAULT 1000– cash field, decimal, 15 digits before the decimal point and two after (financial data, for example, dollars and cents). Cannot accept NULL values. If no value is given, it will get the value 1000.
  • about VARCHAR(255) NULL– the about field, a string up to 255 characters long, can accept empty values.
Note that we can set part of the CONSTRAINT conditions after creating the table. Let's consider the construction for modifying the table structure and its fields: ALTER TABLE table_name ADD CONSTRAINT constraint_name CHECK (condition) using examples:
  • CHECK(tradingMethod IN (1,2,3))– the tradingMethod field can only take values ​​1,2,3
  • CHECK(changeProbability <= 100 AND changeProbability > 0)– the changeProbability field can take integer values ​​in the range from 1 to 100

Relationships between tables

To analyze the description of relationships between tables, let's look at the creation of share_rates:

CREATE TABLE share_rates(
	id BIGINT AUTO_INCREMENT PRIMARY KEY,
	operDate datetime NOT NULL,
	share BIGINT NOT NULL,
	rate DECIMAL(15,2) NOT NULL
);
ALTER TABLE share_rates ADD FOREIGN KEY (share) REFERENCES shares(id)
Part 2. DBMS structure, tables and data types - 4
A reference to the values ​​of another table can be set as follows: ALTER TABLEtable_from_which_referred ADD FOREIGN KEY(field_that_referred) REFERENCEStable_to_which_referred (field_that_referred to) Let in shares we have records on shares, for example, for id=50 we store Microsoft shares with an initial price of 17.5, a delta of 20 and a chance of change of 4%. For the share_rates table we get three main properties:
  • We only need to store the value of the id key from the shares table in the share field in order to use it to obtain the remaining information (name, etc.) from the shares table.
  • We cannot create a rate for a non-existent promotion. You cannot insert a non-existent value into the share field (for which there is no record in the shares table with this id), since there will be no correspondence between the tables.
  • We cannot delete a share entry in shares for which rates are set in share_rates.
The last two points serve to ensure the integrity of the stored data. You can see the creation of SQL tables of our emulation and examples of SQL queries in the Java implementation of methods of the corresponding classes using the link to the github repository at the end of the article. The third part
Comments
TO VIEW ALL COMMENTS OR TO MAKE A COMMENT,
GO TO FULL VERSION