Viacheslav

Zookeeper or what life is like for a zookeeper

Published in the Random EN group

Introduction

Java applications often need configuration, for example the address and port to connect to. If we used the Properties class, it might look like this:

import java.util.Properties;

public class PropertiesExample {
	public static void main(String[] args) {
		Properties props = new Properties();
		props.setProperty("host", "www.tutorialspoint.com");
		System.out.println("Hello, " + props.getProperty("host"));
	}
}
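Hard-coded values are only the starting point: Properties can also be filled from a key=value source such as a file. Below is a minimal sketch of that idea. The class name PropertiesLoadingSketch, the "port" key and the in-memory string standing in for a hypothetical app.properties file are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class PropertiesLoadingSketch {

    // Parses key=value lines from any Reader. For a real file,
    // Files.newBufferedReader(path) could be passed in instead.
    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Stand-in for the contents of a hypothetical app.properties file.
        String fileContents = "host=www.tutorialspoint.com\nport=8080\n";
        Properties props = load(new StringReader(fileContents));
        System.out.println("Hello, " + props.getProperty("host"));
        // Values are always strings, so numeric settings need explicit parsing.
        int port = Integer.parseInt(props.getProperty("port", "80"));
        System.out.println("Port: " + port);
    }
}
```

This works well while every component can read the same file, which is exactly the assumption that breaks once the system is spread over several machines.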
And this seems to be enough, because we can also load Properties from a file. Everything works well as long as we stay on one machine. But imagine that our system starts to consist of several systems that are separated from each other. Such a system is called a distributed system. Wikipedia gives the following definition: distributed systems are systems whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another. You can look at the following diagram:
Zookeeper or how a zoo worker lives - 2
With this approach, a single system is divided into components. Configuration becomes a separate, shared component, and each of the other components acts as a client of it. This setup is called "distributed configuration". There are many different implementations of distributed configuration. In today's review I propose to get acquainted with one of them, called Zookeeper.
Zookeeper or how a zoo worker lives - 3

Zookeeper

Getting to know Zookeeper starts at the official website: zookeeper.apache.org. Go to the "Download" section and download the archive in .tar.gz format, for example "zookeeper-3.4.13.tar.gz". tar is the traditional archive format on Unix systems, and gz means that the archive is compressed with gzip. On a Windows machine this is not a problem: most modern archivers (for example, 7-zip) handle such archives well. Let's extract the contents into some directory. Note the difference in size: the extracted files occupy approximately 60 megabytes on disk, while the downloaded archive was about 35 megabytes, so compression really works.
Now we need to launch Zookeeper. Zookeeper is a server, and it can run in one of two modes: Standalone or Replicated. Let's take the simplest option, Standalone mode. To run, Zookeeper needs a configuration file, so let's create one here: [ZookeeperExtractionDirectory]/conf/zoo.cfg. For Windows, we will follow the recommendations from the Medium article "Installing Apache ZooKeeper on Windows". The contents of the configuration file will be something like this:

tickTime=2000
dataDir=C:/zookeeper-3.4.13/data
clientPort=2181
Let's add the ZOOKEEPER_HOME environment variable containing the path to the Zookeeper root directory (as in the instructions on Medium), and append the following fragment to the PATH environment variable: ;%ZOOKEEPER_HOME%\bin; The directory specified in dataDir must also exist, otherwise Zookeeper will not be able to start the server. Now we can start the server with the command: zkServer. Because the Zookeeper bin directory has been added to the PATH environment variable, we can call Zookeeper commands from anywhere, not just from the bin directory.
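Since zoo.cfg consists of plain key=value lines, it can be read with java.util.Properties, which makes it easy to check the dataDir requirement before starting the server. The following is only a sketch: the class ZooCfgCheckSketch and its methods are hypothetical helpers, not part of Zookeeper, and the example uses a directory under the system temp folder rather than a real installation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Objects;
import java.util.Properties;

final class ZooCfgCheckSketch {

    // Parses zoo.cfg-style text (key=value per line).
    static Properties parse(String cfgText) {
        Properties cfg = new Properties();
        try {
            cfg.load(new StringReader(cfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cfg;
    }

    // Creates the dataDir directory (and missing parents) if it is absent
    // and returns its path. Existing directories are left untouched.
    static Path ensureDataDir(Properties cfg) {
        String dataDir = Objects.requireNonNull(
                cfg.getProperty("dataDir"), "dataDir is not set");
        try {
            return Files.createDirectories(Paths.get(dataDir));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical location under the temp directory. Forward slashes are
        // used because a backslash is an escape character in this format.
        String dataDir = System.getProperty("java.io.tmpdir").replace('\\', '/')
                + "/zk-sketch/data";
        Properties cfg = parse(
                "tickTime=2000\ndataDir=" + dataDir + "\nclientPort=2181\n");
        System.out.println("clientPort = "
                + Integer.parseInt(cfg.getProperty("clientPort")));
        System.out.println("dataDir ready: " + ensureDataDir(cfg));
    }
}
```

The backslash remark is also why the configuration above writes the path as C:/zookeeper-3.4.13/data with forward slashes.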
Zookeeper or how a zoo worker lives - 4

ZNode

As stated in the "Zookeeper Overview", data in Zookeeper is represented as ZNodes (nodes) that are organized into a tree structure. That is, each ZNode can contain data and have child ZNodes. More information about how ZNodes are organized can be found in the Zookeeper documentation: "Data model and the hierarchical namespace". To work with Zookeeper and ZNodes we will use the Zookeeper CLI (command line interface). Earlier we started the server with the zkServer command. Now, to connect, let's execute: zkCli.cmd -server 127.0.0.1:2181. If successful, a connection session to Zookeeper will be created and we will see approximately the following output:
Zookeeper or how a zoo worker lives - 5
Interestingly, even immediately after installation, Zookeeper already contains a ZNode. It has the following path: /zookeeper/quota
Zookeeper or how a zoo worker lives - 6
These are the so-called "quotas". As stated in "Apache ZooKeeper Essentials", each ZNode can have a quota associated with it that limits what it can store: a limit on the number of child znodes and a limit on the amount of stored data. If a limit is exceeded, the operation on the ZNode is not rejected; a warning about the exceeded limit is issued instead. It is worth reading about ZNodes in the "ZooKeeper Programmer's Guide: ZNodes". A few examples from there of how you can work with ZNodes:
Zookeeper or how a zoo worker lives - 7
I would also like to note that ZNodes come in different types. Regular ZNodes (created without additional flags) are "persistent". There are also "ephemeral" ZNodes, which exist only for the duration of the Zookeeper connection session in which they were created. Finally, there are "sequence" ZNodes: a number from a sequence is appended to their name to ensure uniqueness. A sequence node can be either persistent or ephemeral. Some background information about ZNodes can also be found here: “ Zookeeper ZNodes – Characteristics & Example ”.
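To make the three kinds of ZNode more tangible, here is a toy in-memory model of the tree. It is not the Zookeeper API: ToyZNodeTree and its methods are invented for this illustration, and the per-parent counter is a simplification. The ten-digit zero-padded suffix follows the naming that Zookeeper uses for sequence nodes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class ToyZNodeTree {
    // path -> data; only the root exists initially
    private final Map<String, String> nodes = new TreeMap<>();
    // parent path -> how many sequential children were requested so far
    private final Map<String, Integer> counters = new HashMap<>();
    // ephemeral node path -> id of the session that owns it
    private final Map<String, String> owners = new HashMap<>();

    ToyZNodeTree() {
        nodes.put("/", "");
    }

    static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    // ephemeralOwner == null means a persistent node.
    // Returns the actual path, which differs from the requested
    // one when sequential is true.
    String create(String path, String data, boolean sequential, String ephemeralOwner) {
        String parent = parentOf(path);
        if (!nodes.containsKey(parent)) {
            throw new IllegalStateException("Parent does not exist: " + parent);
        }
        String actual = path;
        if (sequential) {
            int next = counters.merge(parent, 1, Integer::sum) - 1;
            actual = path + String.format("%010d", next);
        }
        if (nodes.containsKey(actual)) {
            throw new IllegalStateException("Node already exists: " + actual);
        }
        nodes.put(actual, data);
        if (ephemeralOwner != null) {
            owners.put(actual, ephemeralOwner);
        }
        return actual;
    }

    String get(String path) {
        return nodes.get(path);
    }

    // Ending a session removes every ephemeral node that it created.
    void closeSession(String sessionId) {
        owners.entrySet().removeIf(e -> {
            if (e.getValue().equals(sessionId)) {
                nodes.remove(e.getKey());
                return true;
            }
            return false;
        });
    }

    public static void main(String[] args) {
        ToyZNodeTree tree = new ToyZNodeTree();
        tree.create("/services", "", false, null);
        String first = tree.create("/services/task-", "a", true, null);
        String second = tree.create("/services/task-", "b", true, "session-1");
        System.out.println(first + " and " + second);
        tree.closeSession("session-1");
        System.out.println("ephemeral node after close: " + tree.get(second));
    }
}
```

In the toy model, as in Zookeeper, a node cannot be created under a parent that does not exist, which is why /services is created first.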
Zookeeper or how a zoo worker lives - 8

ZNode Watcher

I would also like to talk about watchers. Details about them are given in the Zookeeper documentation: "ZooKeeper Watches". In short, a watcher is a one-time trigger that fires on some event. When we read data with the getData(), getChildren() or exists() operations, we can set such a trigger as a side effect of the read. Zookeeper guarantees the order in which events are delivered. In addition, the documentation states that a client sees the event about a change before it can see the new value of the ZNode. You can read more about watchers here: “ ZooKeeper Watches – Features & Guarantees ”. To try this, let's use the CLI again. Let's assume we have a ZNode whose value stores the status of some service (the parent nodes /services and /services/service1 must already exist):

[zk: 127.0.0.1:2181(CONNECTED) 0] create /services/service1/status stopped
Created /services/service1/status
[zk: 127.0.0.1:2181(CONNECTED) 1] get /services/service1/status [watch]
stopped
Now, if the data of /services/service1/status changes, our one-time trigger will fire:
Zookeeper or how a zoo worker lives - 9
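The "one-time" nature of a watch is easy to miss, so here is a toy model of it in plain Java. This is not how Zookeeper is implemented: OneShotWatchSketch is a made-up in-memory store that only mimics the behaviour described above, where a watch is set while reading and is consumed by the first change. As in Zookeeper, the notification carries the path, not the new value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

final class OneShotWatchSketch {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    // Reads the value and, if a watcher is given, registers it
    // for the next change of this path only.
    String get(String path, Consumer<String> watcher) {
        if (watcher != null) {
            watches.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
        }
        return data.get(path);
    }

    void set(String path, String value) {
        data.put(path, value);
        // remove() detaches the watchers before they run,
        // so each registration fires at most once.
        List<Consumer<String>> toFire = watches.remove(path);
        if (toFire != null) {
            for (Consumer<String> watcher : toFire) {
                watcher.accept(path);
            }
        }
    }

    public static void main(String[] args) {
        String path = "/services/service1/status";
        OneShotWatchSketch store = new OneShotWatchSketch();
        store.set(path, "stopped");
        store.get(path, p -> System.out.println("Data changed at " + p));
        store.set(path, "started"); // the watcher fires here
        store.set(path, "stopped"); // nothing fires: the watch was consumed
    }
}
```

To keep receiving notifications, a client has to read the node again and set a new watch each time, which is what the test of the sketch does on its last step.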
It’s interesting that when connecting to Zookeeper, we also see how the watcher works:

WATCHER::
WatchedEvent state:SyncConnected type:None path:null
SyncConnected is one of the possible Zookeeper connection states reported in an event (its KeeperState). More details about it can be found in the API description.
Zookeeper or how a zoo worker lives - 10

Zookeeper and Java

We now have a basic understanding of what Zookeeper can do. Let's now work with it from Java rather than through the CLI. For this we need a Java application. To create it, we will use the Gradle build system and its "Gradle Build Init plugin". Let's run the command: gradle init --type java-application. If Gradle asks clarifying questions, we keep the default values (just press Enter). Now let's open the build script, i.e. the build.gradle file. It describes what our project consists of and which artifacts (libraries, frameworks) it depends on. Since we want to use Zookeeper, we need to add it to the dependencies block:

dependencies {
    implementation 'org.apache.zookeeper:zookeeper:3.4.13'
}
You can read more about Gradle in the review "A Brief Introduction to Gradle". So, we have a Java project with the Zookeeper library attached. Let's write something. As we remember, with the CLI we connected like this: zkCli.cmd -server 127.0.0.1:2181. Let's declare a "server" variable in the main method of the App class:

String server = "127.0.0.1:2181";
Connecting is not instantaneous. The main thread has to wait until the connection is established, so we need a lock. Let's declare it below:

Object lock = new Object();
Now we need something to tell us that the connection has been established. As we remember, when we connected through the CLI, a watcher fired. In Java code it works the same way. Our watcher will print a message about the successful connection and notify everyone waiting on the lock. Let's write the watcher:

Watcher connectionWatcher = new Watcher() {
	@Override
	public void process(WatchedEvent we) {
		if (we.getState() == Event.KeeperState.SyncConnected) {
			System.out.println("Connected to Zookeeper in " + Thread.currentThread().getName());
			synchronized (lock) {
				lock.notifyAll();
			}
		}
	}
};
Now let’s add the connection to the Zookeeper server:

int sessionTimeout = 2000;
ZooKeeper zooKeeper = null;
synchronized (lock) {
	zooKeeper = new ZooKeeper(server, sessionTimeout, connectionWatcher);
	lock.wait();
}
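A bare wait() can in principle return without a notification (a spurious wakeup), so a common alternative for this kind of one-time signal is java.util.concurrent.CountDownLatch. The sketch below shows the same handshake with a latch and a timeout. To keep it runnable without a server, a plain thread stands in for Zookeeper's event thread; that thread, the class name ConnectionLatchSketch and the delay values are assumptions for illustration. In a real application, countDown() would be called from the watcher when SyncConnected arrives.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class ConnectionLatchSketch {

    // Starts a thread that plays the role of the event thread and returns
    // true if the "connected" signal arrives before the timeout expires.
    static boolean awaitConnection(long connectDelayMillis, long timeoutMillis) {
        CountDownLatch connected = new CountDownLatch(1);

        Thread eventThread = new Thread(() -> {
            try {
                Thread.sleep(connectDelayMillis); // simulated connection time
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            // Equivalent of lock.notifyAll() in the watcher. The count stays
            // at zero afterwards, so a late waiter does not miss the signal.
            connected.countDown();
        }, "fake-event-thread");
        eventThread.setDaemon(true);
        eventThread.start();

        try {
            // Equivalent of lock.wait(), but bounded by a timeout.
            return connected.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Connected in time: " + awaitConnection(100, 2000));
        System.out.println("Connected in time: " + awaitConnection(2000, 100));
    }
}
```

The timeout is a design choice: with the plain lock.wait() the main thread would wait forever if the server never answered.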
Here is what happens in the synchronized block with lock.wait(). The main thread acquires the lock and requests a connection to Zookeeper. Calling wait() then releases the lock and suspends the main thread until someone else acquires the lock and notifies it. When the connection is established, the watcher fires on the event thread. It checks that the SyncConnected event has arrived (as we remember, this is what the watcher reported in the CLI) and prints a message. Next, it acquires the lock (the main thread released it earlier) and notifies all threads waiting on the lock that they can continue. The event thread then leaves its synchronized block, releasing the lock. The main thread, having received the notification, reacquires the lock and continues, because without the lock it cannot return from wait() and leave the synchronized block. In this way, using multithreading and the Zookeeper API, we can perform various actions. The Zookeeper API is much broader than what the CLI allows. For example:

// Create a new node if it does not exist yet
// (requires: import java.nio.charset.StandardCharsets;)
String znodePath = "/zookeepernode2";
List<ACL> acls = ZooDefs.Ids.OPEN_ACL_UNSAFE;
if (zooKeeper.exists(znodePath, false) == null) {
	zooKeeper.create(znodePath, "data".getBytes(StandardCharsets.UTF_8), acls, CreateMode.PERSISTENT);
}

// Read the data back from the node (no watch, no Stat requested)
byte[] data = zooKeeper.getData(znodePath, false, null);
System.out.println("Result: " + new String(data, StandardCharsets.UTF_8));
As you can see, when creating a node we can specify an ACL. This is another important feature. ACLs are permissions that control which actions are allowed on a ZNode. There are a lot of settings, so I recommend that you refer to the official documentation for details: "Zookeeper ACL Permissions".
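Zookeeper permissions are combined as bit flags: READ, WRITE, CREATE, DELETE and ADMIN. The sketch below re-declares them locally to show how such a mask works. The numeric values are meant to mirror the constants in ZooDefs.Perms, and the letters follow the "cdrwa" notation of the CLI, but treat both as something to verify against the documentation. The class AclPermsSketch and its helper methods are hypothetical.

```java
final class AclPermsSketch {
    // Local copies for illustration; the library keeps its own
    // constants in org.apache.zookeeper.ZooDefs.Perms.
    static final int READ = 1 << 0;
    static final int WRITE = 1 << 1;
    static final int CREATE = 1 << 2;
    static final int DELETE = 1 << 3;
    static final int ADMIN = 1 << 4;
    static final int ALL = READ | WRITE | CREATE | DELETE | ADMIN;

    // True if every requested permission bit is present in the granted mask.
    static boolean allows(int granted, int requested) {
        return (granted & requested) == requested;
    }

    // Renders a mask using the c/d/r/w/a letters.
    static String describe(int perms) {
        StringBuilder sb = new StringBuilder();
        if ((perms & CREATE) != 0) sb.append('c');
        if ((perms & DELETE) != 0) sb.append('d');
        if ((perms & READ) != 0) sb.append('r');
        if ((perms & WRITE) != 0) sb.append('w');
        if ((perms & ADMIN) != 0) sb.append('a');
        return sb.toString();
    }

    public static void main(String[] args) {
        int readOnlyClient = READ;
        System.out.println("read-only may write: " + allows(readOnlyClient, WRITE));
        System.out.println("all permissions: " + describe(ALL));
    }
}
```

OPEN_ACL_UNSAFE, used in the example above, grants all of these permissions to anyone, which is convenient for experiments and, as the name warns, unsafe for anything else.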
Zookeeper or how a zoo worker lives - 11

Conclusion

Why did we read this? Because Zookeeper is used in other popular technologies. For example, Apache Kafka has traditionally relied on Zookeeper, which you can read about in the "Kafka Quick Start Guide". It is also used by the NoSQL database HBase, which you can read more about in the "HBase Quickstart Guide". In fact, many other projects use Zookeeper. Some of them are listed in "Using Zookeeper in the Real World". I hope I answered the question "why". The most important question now is: "What's next?" First, there are books on Apache Zookeeper. Second, there are excellent video talks about Zookeeper. Third, there are several useful articles that will round out the picture. This is a very short review, but I hope it will be useful as an introduction. #Viacheslav