Zabbix Java Template

 

So far we have manually entered all JMX queries that we wanted to use. This gets very tedious very quickly. On top of that, how do we know what is important and what is not?


This page shows you how to use the Zapcat templates to quickly set up a monitor for the most vital statistics of your Java application. The discussion then dissects the template and shows you what each monitored value means and how you may want to interpret the data you receive.




Importing Templates

The Zapcat distribution includes the templates in a directory named ‘templates’. You can upload these to your Zabbix server. Note that the templates were written for Zabbix version 1.4.2 and I have not tested them on any other version.


Locate the file named ‘Template_Java.xml’ in the folder named ‘templates’ in the Zapcat distribution. Log into Zabbix as administrator. Click on ‘Configuration’ and ‘Import/Export’. From the drop-down list in the top right corner, select ‘Import’.


Applying Templates

Using templates is as straightforward as importing them. When entering a host configuration, use the “Add” button to link your host definition with templates.


From the pop-up window, select the Java template that you just imported and press “Select” to close the pop-up.


Check that the template is going to be linked and press “Save” to make the changes.


If you go to the “Monitoring” tab in Zabbix now, you will see the data flow in from the JVM. The template includes some standard triggers and graphs that help you make sense of what is happening inside the JVM.


Dissecting the Java Template

Well then. We now have a JVM instrumented with Zapcat, we have told Zabbix where to find it and we have applied the Java template to get some important graphs right away. The only question left is what it all means. This section takes you through each of the items in the template and explains their significance.


Memory Pools and Garbage Collectors

I am not going to delve very deep into Java’s memory management strategies here. Instead, I’d like to point you to Sun’s collection of white papers. There you will find many excellent articles that explain Java’s internals better than I ever can. I found the paper named “Memory Management in the Java HotSpot™ Virtual Machine” particularly insightful.


Global Strategy

When a JVM starts, it picks the garbage collector strategy that it will use for the remainder of its life. This decision is based on the command line flags, the amount of memory and number of processors that you have equipped your server with.


There are three strategies that the JVM can choose from. These are shown in the table below. For each strategy there are two garbage collectors: a cheap one and an expensive one.
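Assuming the collector names and command line flags used by Sun’s HotSpot JVMs of this era, the mapping looks roughly like this:

    strategy                 cheap collector   expensive collector   command line flag
    serial                   Copy              MarkSweepCompact      -XX:+UseSerialGC
    parallel (throughput)    PS Scavenge       PS MarkSweep          -XX:+UseParallelGC
    concurrent mark-sweep    ParNew            ConcurrentMarkSweep   -XX:+UseConcMarkSweepGC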


The cheap garbage collectors do the grunt work. They are relatively lightweight, choosing application performance over thoroughness. These garbage collectors are named “Copy”, “PS Scavenge” or “ParNew” depending on the garbage collector strategy that the JVM chose.


For each strategy there is also a specialised mark-sweep-compact garbage collector. Such garbage collectors reclaim much more memory than the cheaper ones, but may cause longer delays.


The table also shows the command line flags that can be used to force the JVM to select a specific garbage collector strategy. Unless you have good reason to do so, I would advise letting the JVM pick its own.


The Zabbix Java template provides items, graphs and triggers for all of these garbage collectors. It monitors the number of collection cycles per second and the accumulated garbage collection time for each garbage collector.
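If you would like to see where these numbers come from, the same counters can be read from inside the JVM through the standard management API. Below is a minimal sketch; the template itself reads these values over JMX through Zapcat, so this code is purely illustrative.

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class GcStats {
        public static void main(String[] args) {
            // Each collector ("Copy", "PS Scavenge", "ParNew", ...) reports a cumulative
            // collection count and the total time spent collecting, in milliseconds.
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.printf("%s: %d collections, %d ms total%n",
                        gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            }
        }
    }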


For each garbage collector strategy there is a trigger that goes off when the JVM uses the expensive garbage collector more than the cheap one. That is an indication that the JVM is fighting to free some memory for the application. In practice, when this trigger goes off, your application is already getting out-of-memory errors. Most likely it is about to start behaving erratically.


Having a trigger that says that your application died is nice, but you would probably like some form of advance warning. The template provides for this need by also offering graphs and triggers on the various memory pools.


Generations for Monitoring

There is a large number of memory pools, only some of which are interesting to monitor. We do not have to monitor the Eden spaces and the survivor spaces. While these can and will fill up completely, they are designed to do that: they simply spill over into the next generation. This leaves the old (or tenured) and permanent generations to monitor.


There is one memory pool that is always available and monitored, and that is the code cache pool. This is where the JVM keeps the native code generated by its just-in-time compiler.


For each memory pool there is a graph. The red line at the top of the graph indicates the configured maximum for that memory type. The blue line indicates the memory that was actually requested from the operating system, and the green line shows how much memory is in use.
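The three lines correspond to the three values that every memory pool exposes. Here is a minimal sketch of how they can be read from inside the JVM; again, the template obtains the same values over JMX.

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;
    import java.lang.management.MemoryUsage;

    public class PoolStats {
        public static void main(String[] args) {
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                MemoryUsage usage = pool.getUsage();
                // max is the red line, committed the blue line and used the green line
                System.out.printf("%-20s max=%d committed=%d used=%d%n",
                        pool.getName(), usage.getMax(), usage.getCommitted(), usage.getUsed());
            }
        }
    }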


Each of the monitored memory pools is instrumented with two triggers that serve as early warning flags about possible memory scarcity. The trigger that warns about a memory pool being fully committed tells you that Java has claimed the maximum permitted amount of memory from the operating system for that pool. This may or may not be a bad thing. Applications can run for months without suffering adverse effects from this trigger. My personal strategy is to give the JVM so much RAM that this trigger never goes off. That means I have enough RAM for day-to-day operations, but also to handle unexpected peaks.


The trigger that warns you that memory use in a certain memory pool has risen above 70% is more worrisome. Although your application may well recover if this is a temporary peak, it may also be the last warning you get before your application runs into memory limits.


Memory Leaks

Please note that the Zabbix Java template is designed to spot and alert on memory shortage. It does not help you trace the origins of a memory leak. Although you can use the memory graphs to spot memory leakage, that is not as simple as it sounds. Java applications happily gobble up all the memory they are given. Just because the memory use climbs in a steep slope does not mean you have found a memory leak.


Below is an example of what it looks like when a JVM operates under significant load, without being stressed. In this case, it is a Tomcat application server that is configured to use up to 1GB of RAM. Notice how the JVM returns memory to the operating system when it is not used, as indicated by the blue line.


Below is a graph from the same application, but during a stress test. The JVM is now struggling to satisfy the application’s demand for memory and performing full garbage collects to scrape the bottom of the barrel.


As you can see, when the load is removed, the JVM recovers and reclaims the extra objects that were needed to handle the peak. It even finds some memory to return to the operating system.


Finally, here is a screen shot of an application that is starved for memory. From this image, there is no way to tell if this is a memory leak, or if this application was just not given enough memory to do its job. Only watching your application’s behaviour over longer periods of time will tell you that. In this case, I would suggest giving the application more memory and seeing if this behaviour persists.


File Descriptors

Another hard limit on JVMs is the number of file descriptors available to the process. I have set a trigger that warns you if the JVM has consumed more than 70% of the file descriptors that are available to it.
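On Sun JVMs running on Unix-like systems these numbers come from the operating system mbean. A sketch, assuming the com.sun.management extension classes are available:

    import java.lang.management.ManagementFactory;
    import java.lang.management.OperatingSystemMXBean;

    public class FileDescriptorStats {
        public static void main(String[] args) {
            OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
            // The file descriptor counters are only exposed by the Unix implementation.
            if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
                com.sun.management.UnixOperatingSystemMXBean unix =
                        (com.sun.management.UnixOperatingSystemMXBean) os;
                System.out.printf("open=%d, maximum=%d%n",
                        unix.getOpenFileDescriptorCount(),
                        unix.getMaxFileDescriptorCount());
            }
        }
    }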


The way to change the file descriptor limit for your Java process depends on the operating system you use. Please note that the maximum number of file descriptors is only polled once per hour by Zabbix. If you are working on changing the number of file descriptors available to a process, increase this poll rate first so that you can see the effect more quickly. Once set, polling this item once per hour is more than enough.


The FreeBSD Java port has a buggy implementation of the operating system mbean. It yields seemingly random values for the number of open file descriptors. The maximum number of file descriptors looks fine. This means that on FreeBSD, you will have to check the open file descriptors manually. I filed a bug report, and Greg Lewis has a patch available.


The Mac OS X Java implementation does not show the number of file descriptors in use and it gives a bogus upper limit for the number of file descriptors available. On OS X, like on FreeBSD, you will have to find another way to monitor the number of file descriptors used.


Compiler

The HotSpot compiler is a large factor in the performance of your system. This is in my experience mainly due to the fact that the client compiler only uses one processor in your machine and the server compiler uses all available processors. Nowadays it is actually harder to buy a single processor server than it is to buy a machine with more than one processor. Using the client compiler on a multiprocessor machine is not a good use of the investment.


The template queries the compiler type and the accumulated compiler time. The latter is not of great interest, but the compiler name is important to check. To help with this, there is also a trigger named "jvm uses suboptimal jit compiler". If this trigger is true, you know that adding the flag "-server" to the Java command line is likely to give you a nice performance boost.
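Both values come from the compilation mbean. A minimal sketch:

    import java.lang.management.CompilationMXBean;
    import java.lang.management.ManagementFactory;

    public class CompilerInfo {
        public static void main(String[] args) {
            CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
            // A name that mentions "Client" suggests that -server may help on server hardware.
            System.out.println("compiler: " + jit.getName());
            if (jit.isCompilationTimeMonitoringSupported()) {
                System.out.println("accumulated compile time: "
                        + jit.getTotalCompilationTime() + " ms");
            }
        }
    }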


Class Loading

The class loaders keep track of the number of classes that are loaded and unloaded. This information is particularly interesting in situations where the application consumes a lot of non-heap memory, since classes are loaded into non-heap memory.


Note that JVMs hardly ever unload classes, even if they are not actually in use anymore.
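For reference, these counters come from the class loading mbean. A minimal sketch:

    import java.lang.management.ClassLoadingMXBean;
    import java.lang.management.ManagementFactory;

    public class ClassLoadingStats {
        public static void main(String[] args) {
            ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
            System.out.printf("loaded=%d, total ever loaded=%d, unloaded=%d%n",
                    classes.getLoadedClassCount(),
                    classes.getTotalLoadedClassCount(),
                    classes.getUnloadedClassCount());
        }
    }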


Objects Pending Finalisation

Some objects may be equipped with a finaliser method. These methods are typically used to clean up resources that the object may still hold, such as locks or database connections.


There should never be more than a handful of objects pending finalisation. In fact, you may expect this counter to be zero most of the time. A consistently high number of objects in this state probably points to an application bug.
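The counter itself is exposed by the memory mbean. A minimal sketch:

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;

    public class FinalizationStats {
        public static void main(String[] args) {
            MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
            // On a healthy application this number should hover around zero.
            System.out.println("objects pending finalisation: "
                    + memory.getObjectPendingFinalizationCount());
        }
    }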


Display Variable

One trigger may strike you as odd, and that is the trigger that tests whether the DISPLAY environment variable is set on the JVM’s process. The reason for checking this is that setting this variable is dangerous on headless servers.


Imagine the following scenario: you log into a server using ssh. Since you are working on an X11 terminal of some kind, ssh will automatically set up a tunnel to your X11 display. You start the Java application and test it. Everything works as you expect and you log out, satisfied with a job well done. The ssh tunnel to your X11 display is torn down. Suddenly, your application starts generating error messages and fails.


The primary source for these failures is the fact that Java makes use of a display to perform graphical operations, such as generating images and graphs.


If you had done the same with DISPLAY not set, you would have noticed the problems on the server immediately, instead of only after logging out.


There is a trigger named “display environment variable set” that warns you of this potential issue. The fix is usually to add the command “unset DISPLAY” to your application’s start script. It also helps to ensure that your test servers are headless if your production machines are.


Uptime

The uptime of the application is recorded in Zabbix, allowing us to get some idea of how long the application runs between restarts. The uptime is also used in a trigger that goes off when the application cannot be reached.
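The uptime is read from the runtime mbean, which reports it in milliseconds since the JVM was started. A minimal sketch:

    import java.lang.management.ManagementFactory;
    import java.lang.management.RuntimeMXBean;

    public class UptimeStats {
        public static void main(String[] args) {
            RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
            // Milliseconds since the JVM was started.
            System.out.println("uptime: " + runtime.getUptime() + " ms");
        }
    }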


Versions

The exact versions of the Java VM and the Zapcat agent are recorded, allowing system administrators to verify that the correct version of these components is used on all systems.
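The Java VM version is available from the runtime mbean and the system properties; the Zapcat version is reported by the agent itself. A sketch of the JVM side only:

    import java.lang.management.ManagementFactory;
    import java.lang.management.RuntimeMXBean;

    public class VersionInfo {
        public static void main(String[] args) {
            RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
            System.out.println(runtime.getVmVendor() + " "
                    + runtime.getVmName() + " " + runtime.getVmVersion());
            System.out.println("java.version: " + System.getProperty("java.version"));
        }
    }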


Threads

Most server applications rely heavily on multithreading to handle requests concurrently. Using multiple threads is necessary to make use of all of the available processor cores, since one thread can only run on a single core at any given time.


Threads tend to be relatively expensive to create, and if you find that your application creates and destroys large numbers of threads, it may pay off to have the developers use some form of thread pooling instead. The “java.util.concurrent” package offers many convenient and efficient thread pools, executors and schedulers.
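A minimal sketch of such a pool, with a made-up request handler just to illustrate the idea:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class PooledServer {
        public static void main(String[] args) {
            // Reuse a fixed number of threads rather than creating one per request.
            ExecutorService pool = Executors.newFixedThreadPool(8);
            for (int i = 0; i < 100; i++) {
                final int request = i;
                pool.execute(new Runnable() {
                    public void run() {
                        System.out.println("handling request " + request);
                    }
                });
            }
            pool.shutdown();
        }
    }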

 