

8 August 2023

Zookeeper or how a zoo worker lives

Introduction

Java applications often need configuration, for example the host address and the port to connect to. With the Properties class, it might look like this:

import java.util.Properties;

public class App {
	public static void main(String[] args) {
		Properties props = new Properties();
		props.setProperty("host", "www.tutorialspoint.com");
		System.out.println("Hello, " + props.getProperty("host"));
	}
}
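In practice such settings are usually read from a file rather than set in code. The following is a minimal sketch of that idea, not part of the original example: the class name PropertiesFromText, the "port" key and the file name app.properties are assumptions made for illustration. To keep the sketch self-contained, the file contents are supplied as a string.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropertiesFromText {

    // Parses text in the standard .properties format (one key=value per line).
    static Properties parse(String text) {
        Properties props = new Properties();
        try (Reader reader = new StringReader(text)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Stand-in for the contents of a hypothetical app.properties file.
        String fileContents = "host=www.tutorialspoint.com\nport=8080\n";
        Properties props = parse(fileContents);

        String host = props.getProperty("host");
        // The second argument is the default used when the key is missing.
        int port = Integer.parseInt(props.getProperty("port", "80"));
        System.out.println("Connecting to " + host + ":" + port);
    }
}
```

To read a real file, the StringReader could be replaced with Files.newBufferedReader(Paths.get("app.properties")); the rest stays the same.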

This seems to be enough, because Properties can also be loaded from a file, and on a single machine everything works well. But what if our system grows into several subsystems that run separately from each other? Such a system is called a distributed system. Wikipedia gives the following definition: a distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another. You can take a look at the following diagram:

With this approach, a single system is divided into components. Configuration is a separate shared component, and each of the other components acts as its client. This arrangement is called "distributed configuration". There are many implementations of distributed configuration, and in today's review I propose getting acquainted with one of them: Zookeeper.

Zookeeper

Getting to know Zookeeper starts at the official website: zookeeper.apache.org. On the site, go to the "Download" section and download the archive in .tar.gz format, for example "zookeeper-3.4.13.tar.gz". tar is the traditional archive format on Unix systems, and gz means that gzip is used to compress the archive. If we are working on a Windows machine, this should not bother us: most modern archivers (for example, 7-Zip) handle such archives perfectly well on Windows. Let's extract the contents to some directory. The difference is noticeable: the extracted files take up about 60 megabytes on disk, while the downloaded archive was about 35 megabytes, so compression really works.

Now we need to start Zookeeper. In general, Zookeeper is a kind of server, and it can run in one of two modes: Standalone or Replicated. Let's consider the simplest option, which is also the first one: Standalone mode. To run, Zookeeper needs a configuration file, so let's create it here: [ZookeeperExtractionDirectory]/conf/zoo.cfg. For Windows, we will follow the recommendations from the Medium article "Installing Apache ZooKeeper on Windows". The contents of the configuration file will be something like this:

tickTime=2000
dataDir=C:/zookeeper-3.4.13/data
clientPort=2181

Let's add the ZOOKEEPER_HOME environment variable containing the path to the Zookeeper root directory (as in the Medium instructions), and also add the following fragment to the PATH environment variable: ;%ZOOKEEPER_HOME%\bin; Also, the directory specified in dataDir must exist, otherwise Zookeeper will not be able to start the server. Now we can start the server with the command zkServer. Because the Zookeeper bin directory is on the PATH, we can call Zookeeper commands from anywhere, not just from the bin directory.
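Since a missing dataDir prevents the server from starting, it can be convenient to check it up front. The sketch below is only an illustration under stated assumptions: the class ZooCfgCheck and the method ensureDataDir are hypothetical helpers, not part of Zookeeper. It relies on the fact that zoo.cfg uses the same key=value layout as a .properties file, and it works on a temporary directory so that no real installation is touched.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

class ZooCfgCheck {

    // Reads the dataDir entry from zoo.cfg-style text and creates the
    // directory (including parents) if it does not exist yet.
    static Path ensureDataDir(String cfgText) {
        Properties cfg = new Properties();
        try {
            cfg.load(new StringReader(cfgText));
            String dataDir = cfg.getProperty("dataDir");
            if (dataDir == null) {
                throw new IllegalArgumentException("dataDir is not set");
            }
            return Files.createDirectories(Paths.get(dataDir.trim()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory keeps the demo away from a real install.
        Path base = Files.createTempDirectory("zk-demo");
        // Forward slashes are used because a backslash is an escape
        // character in the .properties format.
        String dir = base.resolve("data").toString().replace('\\', '/');
        String cfgText = "tickTime=2000\n"
                + "dataDir=" + dir + "\n"
                + "clientPort=2181\n";
        Path dataDir = ensureDataDir(cfgText);
        System.out.println("dataDir ready: " + Files.isDirectory(dataDir));
    }
}
```

This is also why the configuration above writes the path as C:/zookeeper-3.4.13/data with forward slashes.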

ZNode

As stated in the "Zookeeper Overview", data in Zookeeper is represented as ZNodes, which are organized into a tree. That is, each ZNode can contain data and have child ZNodes. You can read more about how ZNodes are organized in the Zookeeper documentation: "Data model and the hierarchical namespace". To work with Zookeeper and ZNodes, we will use the Zookeeper CLI (command line interface). Earlier we started the server with the zkServer command. Now, to connect, let's run: zkCli.cmd -server 127.0.0.1:2181 If successful, a session with Zookeeper will be created and we will see something like the following output:

Interestingly, even immediately after installation, Zookeeper already has a ZNode. It has the following path: /zookeeper/quota

These are the so-called "quotas". As stated in "Apache ZooKeeper Essentials", each ZNode can have a quota associated with it that limits the data it can store. A limit can be set both on the number of znodes and on the amount of data stored. If such a limit is exceeded, the operation on the ZNode is not canceled; instead, a warning about exceeding the limit is issued. Read more about ZNodes in the "ZooKeeper Programmer's Guide: ZNodes". Here are a few examples from there of how you can work with ZNodes:

I would also like to note that ZNodes come in different types. Regular ZNodes (created without additional flags) are persistent. There are also ephemeral ZNodes ("Ephemeral Node"), which exist only for the duration of the Zookeeper session in which they were created. And there are sequential ZNodes ("Sequence Node"), whose names get a sequence number appended to ensure uniqueness. A sequence node can be either persistent or ephemeral. For a little more background on ZNodes, see "Zookeeper ZNodes - Characteristics & Example".
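To make the naming of sequence nodes more concrete, here is a small in-memory model. It does not use the Zookeeper API, and the class SequenceNameModel is purely hypothetical. It assumes the behavior described in the ZooKeeper Programmer's Guide: the server keeps a counter per parent node and appends it to the requested name as ten zero-padded digits. In a real server the counter belongs to the parent ZNode and need not start at zero.

```java
import java.util.HashMap;
import java.util.Map;

class SequenceNameModel {

    // One counter per parent path, mirroring the per-parent counter
    // that the server maintains for sequential creates.
    private final Map<String, Integer> counters = new HashMap<>();

    // Returns the full path a sequential create would produce in this model.
    String nextPath(String parent, String prefix) {
        int n = counters.merge(parent, 1, Integer::sum) - 1;
        return parent + "/" + prefix + String.format("%010d", n);
    }

    public static void main(String[] args) {
        SequenceNameModel model = new SequenceNameModel();
        System.out.println(model.nextPath("/locks", "lock-")); // /locks/lock-0000000000
        System.out.println(model.nextPath("/locks", "lock-")); // /locks/lock-0000000001
        System.out.println(model.nextPath("/jobs", "job-"));   // /jobs/job-0000000000
    }
}
```

Because the suffix is fixed-width, sorting the children of a parent by name also sorts them by creation order.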

ZNode Watcher

I would like to talk more about watchers. They are described in detail in the Zookeeper documentation: "ZooKeeper Watches". In short, a watcher is a one-time trigger that fires on some event. When reading data with the getData(), getChildren(), or exists() operations, we can set such a trigger as an additional action. Zookeeper guarantees the order in which events are delivered. In addition, the documentation states that a client sees the event about a change before it can see the new ZNode value. You can read more about watchers here: "ZooKeeper Watches – Features & Guarantees". To try this out, let's use the CLI again. Suppose we have a ZNode in which we store the status of some service (the parent nodes /services and /services/service1 are assumed to exist already):

[zk: 127.0.0.1:2181(CONNECTED) 0] create /services/service1/status stopped
Created /services/service1/status
[zk: 127.0.0.1:2181(CONNECTED) 1] get /services/service1/status [watch]
stopped

Now, if the data in /services/service1/status changes, our one-time trigger will fire:

Interestingly, when connected to Zookeeper, we also see how the watcher works:

WATCHER::
WatchedEvent state:SyncConnected type:None path:null

SyncConnected is one of the possible Zookeeper connection states reported in an event. More details about it can be found in the API description.
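The "one-time" nature of a watch is easy to miss, so here is a toy in-memory model of it. This is not the Zookeeper API: the class OneShotWatchModel and its methods are hypothetical and only imitate the semantics described above. A watch is registered while reading, fires on the next change, and is then gone until it is registered again. As in Zookeeper, the event does not carry the new value, so the client has to read the data again.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class OneShotWatchModel {

    private String data;
    private List<Consumer<String>> watchers = new ArrayList<>();

    OneShotWatchModel(String initial) {
        data = initial;
    }

    // Reads the value; a non-null watcher is registered for the next change only.
    String getData(Consumer<String> watcher) {
        if (watcher != null) {
            watchers.add(watcher);
        }
        return data;
    }

    // Changes the value and fires each pending watcher exactly once.
    void setData(String newValue) {
        data = newValue;
        List<Consumer<String>> toFire = watchers;
        watchers = new ArrayList<>(); // pending watches are consumed by the event
        for (Consumer<String> w : toFire) {
            w.accept("NodeDataChanged");
        }
    }

    public static void main(String[] args) {
        OneShotWatchModel status = new OneShotWatchModel("stopped");
        List<String> events = new ArrayList<>();

        status.getData(events::add);  // read and set a watch
        status.setData("starting");   // the watch fires
        status.setData("running");    // no watch is left, nothing fires

        System.out.println(events.size());        // 1
        System.out.println(status.getData(null)); // running
    }
}
```

To keep observing the node, a client would set a new watch each time it re-reads the data after an event.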


Zookeeper and Java

We now have a basic understanding of what Zookeeper can do. Let's now work with it from Java rather than through the CLI. For this we need a Java application. To create one, we will use the Gradle build system and its "Gradle Build Init plugin". Run the command: gradle init --type java-application If Gradle asks clarifying questions, we keep the default values (just press Enter). Now let's open the build script, the build.gradle file. It describes what our project consists of and which artifacts (libraries, frameworks) it depends on. Because we want to use Zookeeper, we need to add it to the dependencies block:

dependencies {
    implementation 'org.apache.zookeeper:zookeeper:3.4.13'
}

You can read more about Gradle in the review "A brief introduction to Gradle". So, we have a Java project with the Zookeeper library connected to it. Let's write something now. As we remember, from the CLI we connected like this: zkCli.cmd -server 127.0.0.1:2181 Let's declare a "server" variable in the main method of the App class:

String server = "127.0.0.1:2181";

Connecting is not instantaneous. The main thread of the program has to wait somehow until the connection is established, so we need a lock. Let's declare it below:

Object lock = new Object();

Now we need someone to say that the connection is established. As we remember, when we did this through the CLI, the watcher worked for us. So in Java code, everything is exactly the same. Our watcher will display a success message and notify everyone waiting through the lock. Let's write a watcher:

Watcher connectionWatcher = new Watcher() {
	@Override
	public void process(WatchedEvent we) {
		if (we.getState() == Event.KeeperState.SyncConnected) {
			System.out.println("Connected to Zookeeper in " + Thread.currentThread().getName());
			synchronized (lock) {
				lock.notifyAll();
			}
		}
	}
};

Now let's add the connection to the Zookeeper server:

int sessionTimeout = 2000;
ZooKeeper zooKeeper = null;
synchronized (lock) {
	zooKeeper = new ZooKeeper(server, sessionTimeout, connectionWatcher);
	lock.wait();
}
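A bare lock.wait() has two weak points: it may wake up spuriously, and it waits forever if the connection is never established. A common alternative is java.util.concurrent.CountDownLatch with a timeout. The sketch below shows only the waiting pattern, under stated assumptions: the class ConnectionLatchDemo is hypothetical, and an ordinary thread stands in for Zookeeper's event thread so that the example runs without a server. With the real client, countDown() would be called in Watcher.process in place of lock.notifyAll().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class ConnectionLatchDemo {

    // Blocks until the latch is released or the timeout expires.
    // Returns true only if the latch was released in time.
    static boolean awaitConnected(CountDownLatch connected, long timeoutMillis) {
        try {
            return connected.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    public static void main(String[] args) {
        CountDownLatch connected = new CountDownLatch(1);

        // Stand-in for Zookeeper's event thread delivering SyncConnected.
        Thread eventThread = new Thread(() -> {
            System.out.println("Connected in " + Thread.currentThread().getName());
            connected.countDown();
        }, "fake-event-thread");
        eventThread.start();

        if (awaitConnected(connected, 5000)) {
            System.out.println("Main thread continues");
        } else {
            System.out.println("Timed out waiting for the connection");
        }
    }
}
```

The latch also remembers that it was released, so the order in which the two threads reach countDown() and await() does not matter.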

The wait/notify code works as follows. The main thread enters the synchronized block, acquiring the monitor of lock, and requests a connection to Zookeeper. Calling lock.wait() then releases the monitor and suspends the main thread until another thread notifies it. When the connection is established, the watcher fires on Zookeeper's event thread. It checks that the event carries the SyncConnected state (the same one we saw the watcher report in the CLI) and prints a message. It then acquires the monitor, which is free because the main thread released it in wait(), and notifies all threads waiting on lock that they can continue. When the event thread leaves its synchronized block, the monitor is released. The main thread, having been notified, reacquires the monitor, returns from wait(), and leaves its own synchronized block. In the same way, using multithreading and the Zookeeper API, we can perform various other actions. The Zookeeper API is much broader than what the CLI offers. For example:

// Create a new node if it does not exist yet
// (requires: import java.nio.charset.StandardCharsets;)
String znodePath = "/zookeepernode2";
List<ACL> acls = ZooDefs.Ids.OPEN_ACL_UNSAFE;
if (zooKeeper.exists(znodePath, false) == null) {
	zooKeeper.create(znodePath, "data".getBytes(StandardCharsets.UTF_8), acls, CreateMode.PERSISTENT);
}

// Read the data back from the node (no watcher, no Stat requested)
byte[] data = zooKeeper.getData(znodePath, null, null);
System.out.println("Result: " + new String(data, StandardCharsets.UTF_8));

As you can see, when creating a node we can set an ACL. This is another important feature: ACLs are permissions that apply to operations on the ZNode. There are many settings, so for details I recommend referring to the official documentation: "Zookeeper ACL Permissions".


Conclusion

Why read about all this? Because Zookeeper is used in other popular technologies. For example, Apache Kafka has long relied on Zookeeper, which you can read about in the "Kafka Quick Start Guide". It is also used in the NoSQL database HBase, as described in the "HBase Quickstart Guide". In fact, many other projects use Zookeeper; some of them are listed in "Using Zookeeper in the Real World". I hope that answers the "why" question. The big question now is: what's next? First, you can read the following books on Apache Zookeeper:

" ZooKeeper: Distributed Process Coordination ", Flavio Junqueira, Benjamin Reed
" Apache ZooKeeper Essentials "

Secondly, there are excellent video talks about Zookeeper. Recommended for viewing:

Seminar in Yandex: " What is the use of ZooKeeper for developers "
ZooKeeper intro
Centralized Application Configuration with Spring and Apache ZooKeeper

Thirdly, there are several useful articles that will complement the picture of the world:

This turned out to be a very short review, but I hope it is useful as an introduction. #Viacheslav
