Wednesday, December 7, 2016

Creating a Test Ensemble with ZooKeeper and VirtualBox

What is ZooKeeper? It's an Apache project that provides a server for storing shared information and messages for distributed processes. If you have a number of servers that need to coordinate certain information, ZooKeeper might be useful, especially if your project uses Java.

The interface resembles a simple filesystem: "znodes" act like files, and each znode can have child znodes holding more information. Znodes pass and store small pieces of data (less than a megabyte per node, as I recall). A znode can also be created as ephemeral, meaning it exists only as long as the client session that created it. So when a server connects to the ZooKeeper ensemble (cluster), it can register an ephemeral znode that lets other servers find it. When the server goes offline, the znode disappears. Your application can be notified that the server has left (if it set a watch on that znode), or it can list the available servers (znodes) before connecting to one.

That's the simplest overview.

I created 3 ZooKeeper nodes using VirtualBox.

I started by installing a single Ubuntu Server VM (64-bit, version 16.10), using Cluster1 as both the VM name and the hostname. The VM had an 8GB drive (a sparse drive, so it didn't eat all the space on my workstation right away), 1GB RAM and bridged ethernet.

While running through the install I added the SSH server package when prompted.

Once the VM was running I ran
sudo apt-get update
sudo apt-get upgrade

I also ended up running
sudo apt-get dist-upgrade

At that point there were no packages left to update and none held back.

I shut down the VM and told VirtualBox to clone it twice. The first clone was named Cluster2 and the second became Cluster3. In the clone wizard I told VirtualBox to reinitialize the MAC addresses on the network cards and to do a full clone, so these are independent VMs.

I fired up Cluster2 and changed the /etc/hosts and /etc/hostname files to reflect the fact that the machine is cluster2, not cluster1, then repeated the step for Cluster3. After a restart, the two machines should show their proper names.
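The rename can also be scripted. This is only a rough sketch: rename_host is a helper name I made up, it assumes GNU sed (which Ubuntu ships), and it assumes the old hostname appears in lowercase in both files. Check your files before trusting it.

```shell
#!/bin/sh
# rename_host OLD NEW [DIR]
# Replace the whole-word hostname OLD with NEW in DIR/hostname and DIR/hosts.
# DIR defaults to /etc; pass a scratch directory to try it out safely.
# Hypothetical helper, relies on GNU sed for -i and \b.
rename_host() {
    old=$1
    new=$2
    dir=${3:-/etc}
    sed -i "s/\b$old\b/$new/g" "$dir/hostname" "$dir/hosts"
}

# Example, run as root on the Cluster2 clone:
#   rename_host cluster1 cluster2
```

A reboot afterwards is still needed for the new name to show up everywhere.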

Now I have 3 small servers running. In each of them, I ran
sudo apt-get install zookeeper

I edited the /etc/zookeeper/conf/myid file so that cluster1 had the value 1, cluster2 had 2, and cluster3 had 3. In the /etc/zookeeper/conf/zoo.cfg file, in the section that specifies the ZooKeeper servers, I added the IP of each of the three machines, with the server numbers matching the myid values 1, 2, and 3, like so:
server.1=192.168.254.1:2888:3888
server.2=192.168.254.2:2888:3888
server.3=192.168.254.3:2888:3888
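If you'd rather not type these by hand, a small shell function can print them. This is just a sketch; zk_server_lines is a name I made up, and it assumes you want the servers numbered in the order you list the IPs, with the same 2888 and 3888 ports as above.

```shell
#!/bin/sh
# zk_server_lines IP...
# Print one "server.N=IP:2888:3888" line per IP, numbering from 1.
# 2888 is the port followers use to talk to the leader,
# 3888 is the port used for leader election.
zk_server_lines() {
    n=1
    for ip in "$@"; do
        printf 'server.%d=%s:2888:3888\n' "$n" "$ip"
        n=$((n + 1))
    done
}

zk_server_lines 192.168.254.1 192.168.254.2 192.168.254.3
```

The output is the same on every machine, so it can be pasted into each zoo.cfg. Only the myid file differs per server.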

I used the IPs for each server because I didn't edit any hosts file or local DNS to allow finding these ZooKeeper servers by name, although it could certainly be done. On the other hand, using the IP means no DNS lookup, so I might have shaved a few milliseconds off communications.

The default install didn't have any service scripts, so "service zookeeper restart" leaves Ubuntu scratching its head at you. Install the add-on scripts using:
sudo apt-get install zookeeperd

At this point I can run
sudo service zookeeper restart
sudo service zookeeper status

A basic ensemble (or cluster) should now be running!
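One quick way to confirm that is ZooKeeper's "stat" four-letter command, which a server answers on its client port (2181). The sketch below assumes nc (netcat) is installed, and the function names are my own. Newer ZooKeeper releases may require these commands to be whitelisted in the config first.

```shell
#!/bin/sh
# zk_mode: read "stat" output on stdin and print the server's role
# (leader, follower, or standalone) from the "Mode:" line.
zk_mode() {
    sed -n 's/^Mode: //p'
}

# zk_check HOST [PORT]: ask one server for its role.
# -w 2 makes nc give up after two seconds if nothing answers.
zk_check() {
    echo stat | nc -w 2 "$1" "${2:-2181}" | zk_mode
}

# Example:
#   for ip in 192.168.254.1 192.168.254.2 192.168.254.3; do
#       echo "$ip: $(zk_check "$ip")"
#   done
```

In a healthy three-server ensemble you should see one leader and two followers. A server reporting standalone isn't reading the server.N lines from its zoo.cfg.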

How do you test this...or at least do something with it? There's a Java CLI tool included with ZooKeeper, but it turns out there's a bug where the JAVA environment variable isn't set. It's not a big deal...just set it before trying to run the tool.
export JAVA=java

Now you can run the tool. This will launch it, and connect to a local server instance.
/usr/share/zookeeper/bin/zkCli.sh -server 127.0.0.1:2181

From here, you can use the "help" command to get a list of available commands. To just kick the tires a little, I ran these commands:
ls /
create /zk_test My_Data
ls /
get /zk_test
set /zk_test test
get /zk_test
delete /zk_test
ls /
quit

As I ran through the commands (creating the zk_test znode, seeing the data stored as the string "My_Data", setting the data to "test", and finally deleting the znode), I listed and set information from different VMs to confirm that the data was synchronizing properly.
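The same cross-VM check can be done from one terminal, since zkCli.sh will run a single command given after the -server option and then exit, as far as I can tell. Treat this as a sketch: zk_run is a made-up helper, the ZKCLI variable is my own addition so the path can be overridden, and the IPs are the ones from my zoo.cfg.

```shell
#!/bin/sh
# Work around the unset JAVA variable before calling zkCli.sh.
export JAVA="${JAVA:-java}"

# zk_run HOST COMMAND...
# Run one zkCli command against HOST on the client port 2181.
# ZKCLI may be set to point at a different copy of the tool.
zk_run() {
    host=$1
    shift
    "${ZKCLI:-/usr/share/zookeeper/bin/zkCli.sh}" -server "$host:2181" "$@"
}

# Example: write on one server, read it back from another.
#   zk_run 192.168.254.1 create /zk_test My_Data
#   zk_run 192.168.254.2 get /zk_test
#   zk_run 192.168.254.3 delete /zk_test
```

Expect some connection log lines around the actual result.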