This page discusses the architecture of Zapcat and touches on a few design choices that the Zabbix administrator will have to make when Zapcat is introduced.

We will first introduce the Zabbix architecture, looking at the two fundamental ways to gather statistics. For each of these ways, we will look at the effects on Zabbix configuration and at firewall requirements. We’ll look at the organisation of hosts inside Zabbix. As an administrator you will have to choose whether Java applications show up as virtual hosts or as part of a regular machine.

We’ll then zoom in on the Zapcat agent and the Zapcat trapper to get a better understanding of how they interact with your application.

We’ll close the discussion with some performance considerations.

Zabbix Architecture

Zabbix is a tool that was primarily created for technical monitoring. It focuses on machines and on the daemons that run on those machines. You can still use it for functional monitoring by feeding it some carefully selected data. If you are looking to kick-start the cycle of setting up and improving your monitoring, Zabbix might just be the tool for you. Being an Open Source project, it is cheap to start with and has an active community behind it, and its graphs look good in presentations and reports.

For those of you who are not familiar with Zabbix, here is an overview of the Zabbix architecture. The basic components of Zabbix are shown in white, Java applications are shown in red and the Zapcat components are shown in green.

The user interface of Zabbix is a set of PHP pages, usually served up by an Apache web server with a PHP module. The front-end pages access the Zabbix database to retrieve the data for graphing.

The Zabbix database holds all the gathered data, plus the configuration data used by the front-end. Zabbix supports a number of different databases, so you probably get to pick one you like.

The database is fed by the Zabbix server. The server gathers the data using agents. These agents are typically installed on the machines that are being monitored, most likely one agent per machine.

First Choice: Push versus Pull

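The first choice to make is which component takes the initiative in statistics gathering. In a push model, the agent or the application takes the initiative and sends the data to the Zabbix server. In a pull model, the Zabbix server takes the initiative and queries the agent for the data.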

There are two factors that influence your choice between a push and a pull model: configuration requirements and network topology. The configuration of what data items to monitor lies primarily with the component that takes the initiative in statistics gathering. If you opt for the push model, the configuration of what data to gather lies with the Zapcat trapper that performs the push to the server. Likewise, if the Zabbix server takes the initiative and pulls the statistics from the application under scrutiny, the configuration is done in the Zabbix server.

Network topology plays a smaller role, but it may be a limiting factor if your application is behind a NAT gateway and cannot be addressed directly from the Zabbix server. If your application cannot be reached directly, using the push model may be more effective.

Below is a diagram showing the pull model. In this setup, the configuration of what data items to monitor is done in the Zabbix server. The Zabbix server requires an open firewall port on the machine that hosts the Zapcat agent, usually port 10052.

Here is a diagram showing the push model. In this instance, the Zapcat trapper in the Java application autonomously contacts the Zabbix server and sends the statistics. The Zapcat trapper requires an open firewall port on the Zabbix monitor machine, usually port number 10051.
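The difference in firewall requirements comes down to who opens the connection. The following is a minimal sketch using only the JDK, not Zapcat code: the class and method names are made up for illustration, and only the two port numbers are taken from the text above.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Illustrates who opens the connection in each model. Hypothetical names, not Zapcat code. */
final class ConnectionDirectionSketch {
    static final int AGENT_PORT = 10052;  // pull: must be reachable on the monitored machine
    static final int SERVER_PORT = 10051; // push: must be reachable on the Zabbix server machine

    /** Pull model: the application listens and waits for the Zabbix server to connect. */
    static ServerSocket listenForServer(int port) throws IOException {
        return new ServerSocket(port);
    }

    /** Push model: the application dials out to the Zabbix server. */
    static Socket connectToServer(InetAddress server, int port) throws IOException {
        return new Socket(server, port);
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the operating system for any free port, so the demo
        // runs on one machine without touching the real Zabbix ports.
        try (ServerSocket inbound = listenForServer(0);
             Socket outbound = connectToServer(InetAddress.getLoopbackAddress(),
                                               inbound.getLocalPort());
             Socket accepted = inbound.accept()) {
            System.out.println("listening side bound to port " + inbound.getLocalPort());
            System.out.println("dialling side connected to port " + outbound.getPort());
        }
    }
}
```

In the pull model your application plays the listening side, so the firewall in front of the application has to let the Zabbix server in. In the push model your application plays the dialling side, so only the Zabbix server has to accept inbound connections, which is why push can work from behind a NAT gateway.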

My personal preference is to use the pull model. I like it because it centralises the configuration of the monitoring. Centralised configuration allows my Zabbix monitoring to grow and adapt quickly and in a non-invasive manner.

For small sites or stand-alone applications decentralised configuration may be effective, but for larger installations it gets nearly impossible to change anything quickly. Zapcat makes this even more pressing, because the configuration of the Zapcat trappers is compiled into the application’s code. Thus, adding a new item to monitor requires a roll-out of a new version of your application.

Second Choice: Virtual versus Actual Hosts

Zabbix groups the statistics by the host that they came from. For application statistics you can either mix them with the statistics for the host that the application runs on, or create a virtual host to group the application statistics under.

Actually, this is not much of a choice. If you go for the pull model, as described above, you will automatically end up with a separate host with the application statistics in it. The reason is that each host in Zabbix represents one agent. On the machine that you are monitoring, there are two agents: the Zabbix agent and the Zapcat agent. Zabbix treats each of these as a separate host.

The alternative is to use the Zapcat trapper to send data to the Zabbix server. You are then free to pick the host that you want the data to show up in. Unfortunately, this forces you to decentralise configuration again.

Dissecting the Zapcat Agent

When we look inside the Java application, the Zapcat agent is not much more than a daemon thread that listens for queries from a Zabbix server. It spawns a separate query handler for each connection that a Zabbix server makes. This is to let things run smoothly if more than one Zabbix server sends queries to our agent.

It is the responsibility of the query handler to actually read the data from the socket connection and parse the query. The query handler then translates the query into calls on the JMX mbean server to service JMX queries.
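To make the translation step concrete, here is a minimal sketch of turning one textual query into an attribute read on the platform mbean server. It is not Zapcat's own code. The query syntax `jmx[<object name>][<attribute>]`, the class name and the `NOT_SUPPORTED` reply are assumptions made for this illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Sketch of translating a textual query into a JMX attribute read. Not Zapcat's code. */
final class JmxQuerySketch {
    // Assumed query syntax for this sketch: jmx[<object name>][<attribute>]
    private static final Pattern QUERY =
            Pattern.compile("jmx\\[([^\\]]+)\\]\\[([^\\]]+)\\]");

    // Placeholder reply for anything that cannot be answered (an assumption of this sketch)
    static final String NOT_SUPPORTED = "NOT_SUPPORTED";

    static String answer(String query) {
        Matcher m = QUERY.matcher(query.trim());
        if (!m.matches()) {
            return NOT_SUPPORTED;
        }
        try {
            MBeanServer mbeans = ManagementFactory.getPlatformMBeanServer();
            // group(1) is the mbean's object name, group(2) the attribute to read
            Object value = mbeans.getAttribute(new ObjectName(m.group(1)), m.group(2));
            return String.valueOf(value);
        } catch (JMException | RuntimeException e) {
            // Malformed name, unknown mbean or unknown attribute
            return NOT_SUPPORTED;
        }
    }

    public static void main(String[] args) {
        System.out.println(answer("jmx[java.lang:type=Runtime][VmName]"));
        System.out.println(answer("jmx[java.lang:type=Runtime][NoSuchAttribute]"));
    }
}
```

Because the handler only reads whatever the mbean server exposes, any mbean that your application registers becomes available to monitoring without extra glue code.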

The threads associated with query handlers are kept in a limited pool to keep resource usage under control while still allowing some concurrency.
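The threading arrangement described above can be sketched as follows: one daemon thread accepts connections and hands each one to a handler from a fixed-size pool. This is an illustration under assumptions, not the Zapcat implementation. The class name, the one-line-per-connection framing and the pluggable `answerer` function are all invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.function.Function;

/** Sketch of a daemon listener that hands connections to a bounded handler pool. */
final class AgentSketch implements AutoCloseable {
    private final ServerSocket listener;
    private final ExecutorService handlers;
    private final Function<String, String> answerer;

    AgentSketch(int port, int maxHandlers, Function<String, String> answerer)
            throws IOException {
        this.listener = new ServerSocket(port);
        this.answerer = answerer;
        // The limited pool: at most maxHandlers queries are serviced concurrently
        this.handlers = Executors.newFixedThreadPool(maxHandlers, task -> {
            Thread worker = new Thread(task, "agent-sketch-handler");
            worker.setDaemon(true);
            return worker;
        });
        Thread acceptor = new Thread(this::acceptLoop, "agent-sketch-listener");
        acceptor.setDaemon(true); // never keeps the application alive on its own
        acceptor.start();
    }

    int port() {
        return listener.getLocalPort();
    }

    private void acceptLoop() {
        try {
            while (true) {
                Socket connection = listener.accept();
                handlers.execute(() -> handle(connection)); // one handler per connection
            }
        } catch (IOException | RejectedExecutionException e) {
            // Reached when close() shuts the listener or the pool; the thread then ends.
            // For simplicity, any other accept failure also stops this sketch.
        }
    }

    private void handle(Socket connection) {
        try (Socket socket = connection;
             BufferedReader in = reader(socket);
             PrintWriter out = writer(socket)) {
            String query = in.readLine(); // assumed framing: one line per connection
            if (query != null) {
                out.println(answerer.apply(query));
            }
        } catch (IOException e) {
            // A broken connection only affects this one handler
        }
    }

    private static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(
                new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    private static PrintWriter writer(Socket s) throws IOException {
        return new PrintWriter(
                new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    @Override
    public void close() throws IOException {
        listener.close();
        handlers.shutdown();
    }

    /** Client side of the demo: sends one query and reads the one-line reply. */
    static String ask(int port, String query) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = writer(socket);
             BufferedReader in = reader(socket)) {
            out.println(query);
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        try (AgentSketch agent = new AgentSketch(0, 4, query -> "echo:" + query)) {
            System.out.println(ask(agent.port(), "ping"));
        }
    }
}
```

The fixed pool size is the knob that trades concurrency against resource usage: connections beyond the limit wait in the pool's queue instead of each getting a fresh thread.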

Dissecting the Zapcat Trapper

Zapcat trappers look quite different from agents. They are merely a nice interface to a queue of data items that need to be sent to the Zabbix server.

At the other end of the queue there is a sender that sends each of the data items to the Zabbix server. The sender is responsible for actually performing the JMX queries, opening and closing sockets and encoding the data for transport.

Each trapper uses a timer to schedule the periodic JMX queries that are then queued up and sent to the Zabbix server.
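The queue, sender and timer fit together roughly as in the sketch below. It is a simplified illustration, not the Zapcat trapper itself. The class and method names are hypothetical, and the `transport` callback stands in for the real work of opening a socket, encoding the item and sending it to the Zabbix server.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Sketch of a trapper: a thin front for a queue that a sender thread drains. */
final class TrapperSketch implements AutoCloseable {
    /** A queued item; its value is only read when the sender gets to it. */
    private static final class Item {
        final String key;
        final Supplier<String> value;

        Item(String key, Supplier<String> value) {
            this.key = key;
            this.value = value;
        }
    }

    private final BlockingQueue<Item> queue = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(task -> {
                Thread t = new Thread(task, "trapper-sketch-timer");
                t.setDaemon(true);
                return t;
            });
    private final Thread sender;

    /** host is the Zabbix host the items are filed under; transport is a stand-in for the socket. */
    TrapperSketch(String host, Consumer<String> transport) {
        sender = new Thread(() -> drain(host, transport), "trapper-sketch-sender");
        sender.setDaemon(true);
        sender.start();
    }

    private void drain(String host, Consumer<String> transport) {
        try {
            while (true) {
                Item item = queue.take();
                // The sender performs the query, then hands the result to the transport
                transport.accept(host + " " + item.key + " " + item.value.get());
            }
        } catch (InterruptedException e) {
            // close() interrupts the sender to stop it
        }
    }

    /** Queue a single, already known value. */
    void send(String key, String value) {
        queue.add(new Item(key, () -> value));
    }

    /** Queue the given query periodically, starting immediately. */
    void every(long period, TimeUnit unit, String key, Supplier<String> query) {
        timer.scheduleAtFixedRate(() -> queue.add(new Item(key, query)), 0, period, unit);
    }

    @Override
    public void close() {
        timer.shutdownNow();
        sender.interrupt();
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> sent = new CopyOnWriteArrayList<>();
        try (TrapperSketch trapper = new TrapperSketch("app-host", sent::add)) {
            trapper.send("java.version", System.getProperty("java.version"));
            trapper.every(1, TimeUnit.SECONDS, "uptime",
                    () -> String.valueOf(ManagementFactory.getRuntimeMXBean().getUptime()));
            Thread.sleep(2500);
        }
        sent.forEach(System.out::println);
    }
}
```

Note that the host name and the list of items are fixed in the application's source, which is exactly the decentralised configuration discussed earlier.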

Performance Considerations

Zabbix does not support sending more than one data item in one session. In order to send a number of data items, it will open and close a TCP socket for each item. Of course, sending data like this is not free. Doing so causes the load on your servers and network to increase. You may find it pays to sit down and do some calculations on the back of an envelope to see if the additional load is justified by the new monitoring possibilities. In particular, have a look at precisely how you use the data gathered. Perhaps you can move some of the aggregation operations from the reporting into the data gathering stage and reduce the traffic.
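As an example of such an envelope calculation, the sketch below estimates how many TCP connections per second a set of monitored items generates, given one connection per item per polling interval. All of the figures (20 applications, 30 items, 30 seconds, 6 aggregated items) are hypothetical and only serve to show the arithmetic.

```java
/** Back-of-the-envelope estimate of connection churn. All figures are hypothetical. */
final class LoadEstimateSketch {
    /** One TCP connection per item per polling interval. */
    static double connectionsPerSecond(int items, int intervalSeconds) {
        return (double) items / intervalSeconds;
    }

    public static void main(String[] args) {
        // Hypothetical installation: 20 applications, 30 items each, polled every 30 s
        double raw = connectionsPerSecond(20 * 30, 30);
        // The same installation after aggregating inside the application to 6 items each
        double aggregated = connectionsPerSecond(20 * 6, 30);
        System.out.printf("raw: %.1f connections/s, aggregated: %.1f connections/s%n",
                raw, aggregated);
    }
}
```

With these made-up numbers, 600 items at a 30 second interval come to 20 connections per second, and aggregating down to 120 items brings that to 4 per second.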

Installing Zabbix on a dedicated machine will help keep the increase in processor load on your production machines to a minimum. An additional benefit is that you are free to run complex analyses or generate computationally expensive reports without impacting the performance of your production platform.
