First part
We continue to create our simple stock exchange emulator. Here's what we'll do:
- Let's create a database organization diagram.
- We will describe what, how and where it is stored.
- Let's find out how data is related to each other.
- Let's start learning the basics of SQL using the example of the SQL table creation command CREATE TABLE , Data Definition Language ( DDL ) of the SQL language.
- Let's continue writing the Java program. We implement the main functions of the DBMS in terms of java.sql for creating our database programmatically, using JDBC and a three-tier architecture.
These two parts turned out to be more voluminous, since we need to familiarize ourselves with the basics of SQL and the organization of a DBMS from the inside, and draw analogies with Java. In order not to bore you with code listings, at the end there are links to the corresponding commit github repository with the program.
DBMS design
Application Description
You've already heard that organizing data storage is an integral part of programming. Let me remind you that the purpose of our application is the simplest exchange emulation:
- There are shares whose value can change during the trading day according to given rules;
- there are traders with initial capital;
- traders can buy and sell shares according to their algorithm.
The exchange operates
in ticks - fixed periods of time (in our case - 1 minute). During a tick, the stock price may change, and then the trader may buy or sell shares.
Exchange emulation data structure
Let's call individual exchange entities models. To avoid rounding errors, we will work with financial amounts through a class
BigDecimal
(details can be found in the link at the end of the article). Let's describe the structure of each model in more detail:
Promotion:
Attribute |
Type |
Description |
name |
Srting |
Name |
changeProbability |
int |
Probability of rate change as a percentage on each tick |
startPrice |
BigDecimal |
Initial cost |
delta |
int |
The maximum amount in percentage by which the current value can change |
Share price:
Attribute |
Type |
Description |
operDate |
LocalDateTime |
Time (tick) for setting the rate |
share |
Promotion |
Link to promotion |
rate |
BigDecimal |
Share price |
Trader:
Attribute |
Type |
Description |
name |
String |
Time (tick) for setting the rate |
sfreqTick |
int |
Frequency of transactions. Specified by the period, in ticks, after which the trader performs operations |
cash |
BigDecimal |
Amount of money other than shares |
traidingMethod |
int |
The algorithm used by the trader. Let's set it as a constant number, the implementation of the algorithm will be (in the following parts) in Java code |
changeProbability |
int |
Probability of completing the operation, percentage |
about |
String |
Probability of rate change, in percentage, on each tick |
Trader actions:
Attribute |
Type |
Description |
operation |
int |
Transaction type (buy or sell) |
traider |
Trader |
Trader link |
shareRate |
Share price |
Link to the stock price (respectively, the stock itself, its rate and the time it was issued) |
amount |
Long |
Number of shares involved in the transaction |
To ensure the uniqueness of each model, we will add an attribute
id
of type
long . This attribute will be
unique within model instances and will uniquely identify it. Attributes that reference other models (trader, stock, stock price) can use this one
id
to uniquely identify the corresponding model. The thought immediately comes to mind that we could use
Map<Long, Object>
to store such data, where
Object
is the corresponding model. However, try implementing this in code under the following conditions:
- the data size significantly exceeds the amount of available RAM;
- access to data is expected from a dozen different places;
- the ability to simultaneously modify and read data is required;
- it is necessary to ensure rules for the formation and integrity of data;
...and you will be faced with tasks that require proper qualifications and time to implement. There is no need to “reinvent the wheel”. Much has already been thought out and written for us. So we will use what has already been tested over the years.
Storing Data in Java
Let's consider the action. In Java, we created a specific class for this model
Share
with fields
name
,
changeProbability
,
startPrice
,
delta
. And many shares were stored as
Map<Long, Share>
, where the key is a unique identifier for each share.
public class Share {
private String name;
private BigDecimal startPrice;
private int changeProbability;
private int delta;
}
Map<Long, Share> shares = new HashMap<>();
shares.put(1L, new Share("ibm", BigDecimal.valueOf(20.0), 15, 10));
shares.put(2L, new Share("apple", BigDecimal.valueOf(14.0), 25, 15));
shares.put(3L, new Share("google", BigDecimal.valueOf(12.0), 20, 8));
...
shares.put(50L, new Share("microsoft", BigDecimal.valueOf(17.5), 10,4 ));
To access the desired promotion by ID, use the method
shares.get(id)
. For the task of finding a stock by name or price, we would loop through all the records looking for the one we need, and so on. But we will go the other way and store the values in the DBMS.
Data storage in a DBMS
Let us formulate an initial set of data storage rules for a DBMS:
- Data in a DBMS is organized into tables ( TABLE ), which are a set of records.
- All records have the same sets of fields. They are set when creating the table.
- The field can be set to a default value ( DEFAULT ).
- For a table, you can set constraints ( CONSTRAINT ) that describe the requirements for its data to ensure their integrity. This can be done at the table creation stage ( CREATE TABLE ) or added later ( ALTER TABLE ... ADD CONSTRAINT ).
- The most common CONSTRAINT :
- The primary key is PRIMARY (Id in our case).
- Unique value field UNIQUE (VIN for the vehicle table).
- Checking the CHECK field (percentage value cannot be greater than 100). One of the private restrictions on a field is NOT NULL or NULL , which prohibits/allows storing NULL in a table field.
- Link to a third-party table FOREIGN KEY (link to a stock in the stock price table).
- Index INDEX (indexing a field to speed up the search for values in it).
- Modification of a record ( INSERT , UPDATE ) will not occur if the values of its fields contradict the restrictions (CONSTRAINT).
- Each table can have a key field (or several) that can be used to uniquely identify a record. Such a field (or fields, if they form a composite key) forms the primary key of the table - PRIMARY KEY .
- The primary key ensures the uniqueness of a record in the table; an index is created on it, which gives quick access to the entire record based on the key value.
- Having a primary key makes it much easier to create links between tables. Next, we will use an artificial primary key: for the first record
id = 1
, each subsequent record will be inserted into the table with the id value increased by one. This key is often called AutoIncrement or AutoIdentity .
Actually, a table of stocks:
Is it possible to use the stock name as a key in this case? By and large - yes, but there is a possibility that some company issues different shares and calls them only by its own name. In this case, there will no longer be uniqueness. In practice, an artificial primary key is used quite often. Agree, using a full name as a unique key in a table containing records of people will not ensure uniqueness. As well as using a combination of full name and date of birth.
Data types in DBMS
Like any other programming language, SQL has data typing. Here are the most common SQL data types:
Integer types
SQL type |
SQL synonyms |
Matching in Java |
Description |
INT |
INT4,INTEGER |
java.lang.Integer |
4-byte integer, -2147483648 … 2147483647 |
BOOLEAN |
BOOL, BIT |
java.lang.Boolean |
True, False |
TINYINT |
|
java.lang.Byte |
1-byte integer, -128 … 127 |
SMALLINT |
INT2 |
java.lang.Short |
2-byte integer, -32768 … 32767 |
BIGINT |
INT8 |
java.lang.Long |
8-byte integer, -9223372036854775808 … 9223372036854775807 |
AUTO_INCREMENT |
INCREMENT |
java.lang.Long |
An incremental counter unique to the table. If a new value is inserted into it, it is increased by one. The generated values are never repeated. |
Real
SQL type |
SQL synonyms |
Matching in Java |
Description |
DECIMAL(N,M) |
DEC, NUMBER |
java.math.BigDecimal |
Fixed precision decimal (N integer digits and M fractional digits). Mainly designed for working with financial data. |
DOUBLE |
FLOAT8 |
java.lang.Double |
Double precision real number (8 bytes). |
REAL |
FLOAT4 |
java.lang.Real |
Single precision real number (4 bytes). |
String
SQL type |
SQL synonyms |
Matching in Java |
Description |
VARCHAR(N) |
NVARCHAR |
java.lang.String |
UNICODE string of length N. Length limited to 2147483647 Loads the entire contents of the string into memory. |
date and time
SQL type |
SQL synonyms |
Matching in Java |
Description |
TIME |
|
java.time.LocalTime, java.sql.Time |
Storing time (up to nanoseconds), when converting to DATETIME, the date is set to January 1, 1970. |
DATE |
|
java.time.LocalDate, java.sql.Timestamp |
Storing dates in yyyy-mm-dd format, time is set as 00:00 |
DATETIME |
TIMESTAMP |
java.time.LocalDateTime, java.sql.Timestamp |
Storing date + time (without taking into account time zones). |
Storage of large volumes of data
SQL type |
Matching in Java |
Description |
BLOB |
java.io.InputStream, java.sql.Blob |
Storing binary data (pictures, files...). |
CLOB |
java.io.Reader, java.sql.Clob |
Storing large text data (books, articles...), unlike VARCHAR, loads data into memory in portions. |
SQL writing style
For many languages, there are code formatting guidelines. Typically, such documents contain rules for naming variables, constants, methods and other language structures. So, for Python there is PEP8, for
Java - Oracle Code Conventions for Java . Several different sets have been created for SQL, which are slightly different from each other. Regardless, you should develop the habit of following rules when formatting your code, especially if you work in a team. The rules could be, for example, the following (of course, you can develop a different set of rules for yourself, the main thing is to stick to them in the future):
- Keywords and reserved words, including commands and operators, must be written in capital letters: CREATE TABLE, CONSTRAINT...
- The names of tables, fields and other objects should not coincide with SQL language keywords (see the link at the end of the article), but may contain them.
- Table names should reflect their purpose. They are written in lowercase letters. Words in the name are separated from each other by underscores. The word at the end must be in the plural : traders (traders), share_rates (share rate).
- Table field names should reflect their purpose. They must be written in lowercase letters, the words in the name must be formatted in Camel Case style , and the word at the end must be used in the singular : name (name), share_rates (share rate).
- Artificial key fields must contain the word id.
- CONSTRAINT names must follow table naming conventions. They must also include the fields and tables involved in them, begin with a semantic prefix: check_ (checking the field value), pk_ (primary key), fk_ (foreign key), uniq_ (field uniqueness), idx_ (index). Example: pk_traider_share_actions_id (primary key on the id field for the trader_share_actions table).
- And so on, as you study SQL, the list of rules will be replenished/changed.
DBMS design
Immediately before creating a DBMS, it needs to be designed. The final schema contains tables, a set of fields, CONSTRAINT, keys, default conditions for fields, relationships between tables and other database entities. On the Internet you can find many free online/offline designers for designing small DBMSs. Try typing something like “Database designer free” into a search engine. Such applications have useful additional properties:
- Can generate SQL commands to create a DBMS.
- Visually display the settings on the diagram.
- Allows you to move tables for better visualization.
- Show keys, indexes, relationships, default values, and the like on the diagram.
- They can remotely store the DBMS schema.
For example,
dbdiffo.com highlights keys, shows non-empty fields and AI (AutoIncrement) counters with the NN label:
Creating tables in a DBMS
So we have a diagram. Now let's move on to creating tables (CREATE TABLE). To do this, it is advisable for us to have preliminary data:
- table name
- field names and type
- restrictions (CONSTRAINTS) on fields
- default values for fields (if available)
- primary key (PRIMARY KEY) if available
- connections between tables (FOREIGN KEY)
We will not study in detail all the options of the CREATE TABLE command; we will look at the basics of SQL using the example of creating a table for traders:
CREATE TABLE traiders(
id BIGINT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
freqTiсk INTEGER NOT NULL,
cash DECIMAL(15,2) NOT NULL DEFAULT 1000,
tradingMethod INTEGER NOT NULL,
changeProbability INTEGER NOT NULL DEFAULT 50,
about VARCHAR(255) NULL
);
ALTER TABLE traiders ADD CONSTRAINT check_traiders_tradingMethod
CHECK(tradingMethod IN (1,2,3));
ALTER TABLE traiders ADD CONSTRAINT check_traiders_changeProbability
CHECK(changeProbability <= 100 AND changeProbability > 0)
Let's take a closer look:
CREATE TABLE traiders
(field description) - creates a table with the specified name; in the description, the fields are separated by a comma. Any command ends with a semicolon.
- The field description begins with its name, followed by its type, CONSTRAINT, and default value.
id BIGINT AUTO_INCREMENT PRIMARY KEY
– the id field of an integer type is a primary key and an incremental counter (for each new record for the id field, a value will be generated that is one more than the previously created one for this table).
cash DECIMAL(15,2) NOT NULL DEFAULT 1000
– cash field, decimal, 15 digits before the decimal point and two after (financial data, for example, dollars and cents). Cannot accept NULL values. If no value is given, it will get the value 1000.
about VARCHAR(255) NULL
– the about field, a string up to 255 characters long, can accept empty values.
Note that we can set part of
the CONSTRAINT conditions after creating the table. Let's consider the construction for modifying the table structure and its fields:
ALTER TABLE table_name ADD CONSTRAINT constraint_name CHECK (condition) using examples:
CHECK(tradingMethod IN (1,2,3))
– the tradingMethod field can only take values 1,2,3
CHECK(changeProbability <= 100 AND changeProbability > 0)
– the changeProbability field can take integer values in the range from 1 to 100
Relationships between tables
To analyze the description of relationships between tables, let's look at the creation of share_rates:
CREATE TABLE share_rates(
id BIGINT AUTO_INCREMENT PRIMARY KEY,
operDate datetime NOT NULL,
share BIGINT NOT NULL,
rate DECIMAL(15,2) NOT NULL
);
ALTER TABLE share_rates ADD FOREIGN KEY (share) REFERENCES shares(id)
A reference to the values of another table can be set as follows:
ALTER TABLE
table_from_which_referred
ADD FOREIGN KEY
(field_that_referred)
REFERENCES
table_to_which_referred (field_that_referred to) Let in
shares we have records on shares, for example, for id=50 we store Microsoft shares with an initial price of 17.5, a delta of 20 and a chance of change of 4%. For the
share_rates table we get three main properties:
- We only need to store the value of the id key from the shares table in the share field in order to use it to obtain the remaining information (name, etc.) from the shares table.
- We cannot create a rate for a non-existent promotion. You cannot insert a non-existent value into the share field (for which there is no record in the shares table with this id), since there will be no correspondence between the tables.
- We cannot delete a share entry in shares for which rates are set in share_rates.
The last two points serve to ensure the integrity of the stored data. You can see the creation of SQL tables of our emulation and examples of SQL queries in the Java implementation of methods of the corresponding classes using the link to the github repository at the end of the article.
The third part
GO TO FULL VERSION