Basic Garbage Collection Analysis

Performance Problems

I am a small cog in a large company which has adopted Java in a big way.  I am often called in for firefighting activities across Java environments I have never seen before.  Most of the issues are related to stability and performance.  Many times, the lack of both stability and performance is related to the application’s consumption of memory.   JVM memory problems frequently manifest themselves in elusive ways, appearing to be a delay in middleware/ESB tiers whose times are logged within the Java application.  The problem is, the JVM went on break for a Full GC for 7 seconds.  Since the break was taken after the message was sent, but before the client started looking for the response, the middleware tier gets billed for the cleanup.  So, generally I will check memory first in environments containing Java components no matter what the symptoms appear to be.

Before digging into such problems, I also ask a few basic questions.

  • Does the application cache?
    • If so, what is the caching policy?
      • Least recently used
      • Absolute lifetime
      • Duration of a session
      • etc…
    • How much information is cached?
    • What are the keys to the cache?
  • Is the application 32 or 64 bit?
  • Does this application use JNI?
    • This means that we’ll also have to keep a close eye on real memory usage as well on 32 bit platforms.  The absolute memory limit here is 4G between native and java memory.  If native needs more, there is less available to Java.
  • Can I have the arguments given to the JVM?  If not, I’ll get them anyhow from process statistics.
    • From this, you can generally see the intent.  I tend to one of two philosophies in practice:
      • Minimalistic: Let the JVM do what it will.
      • Highly Tuned: These configurations use all known flags which seem to imply an improvement.

More often than not, engineers diagnosing memory issues are using tools such as WebSphere PMI which offer only a coarse grained view of memory usage.  They never achieve a true view of what Java is actually doing with the memory under the covers.  No matter how I rant, rave, theorize and postulate, this behavior continues.

This is where my favorite tool comes into play; jstat.

Like most good tools, jstat is very simple to use.  It will never hang or crash a production server and it doesn’t need a fancy windows based GUI to grab data, though the fancy GUI piece is where Dex comes in to fill the gap.  However, I’ll take good raw data over fancy graphs of bad data anytime.

Java Memory Layout And JStat

By far, my favorite tool for getting good garbage collection data is jstat.  It’s not fancy but it produces reliable and accurate data that I have come to trust.

Below is a diagram that shows how the memory is laid out within the JVM:

Image

Garbage collectors treat the Young and tenured generations differently.  What is important here is that young generation collections do not stop the JVM.  Full Garbage Collections which collect across the Tenured Generation do pause the JVM.  The extent of this pause varies from garbage collector to collector.

On unix systems the jstat command I use to collect data is:

jstat -gc -t $PID 15s > jstat.log 2>&1 &

Where $PID is the process id of the JVM I am collecting.

  • -gc gives me the garbage collection statistics.
  • -t gives me a timestamp measured in seconds since the JVM was started.
  • $PID is the process id of the JVM I am monitoring
  • 15s tells jstat to sample every 15 seconds.
  • < jstat.log stores the output in a file called jstat.log
  • 2>&1 tells unis to send errors to the same place standard out is going.  In this case, jstat.log.
  • & tells Unix to run this process in the background.

I also sometimes nohup the process if I plan to log out and check on it another day.

nohup jstat -gc -t $PID 15s > jstat.log 2>&1 &

Once jstat is executed, it will output information every 15 seconds like the following:

Timestamp,S0C,S1C,S0U,S1U,EC,EU,OC,OU,PC,PU,YGC,YGCT,FGC,FGCT,GCT
1921.1,448.0,448.0,0.0,0.0,523392.0,418853.0,1572864.0,496523.1,262144.0,121147.2,26,2.996,0,0.000,2.996
1936.2,448.0,448.0,0.0,0.0,523392.0,420545.1,1572864.0,496523.1,262144.0,121147.2,26,2.996,0,0.000,2.996
1951.2,448.0,448.0,0.0,0.0,523392.0,422024.8,1572864.0,496523.1,262144.0,121147.2,26,2.996,0,0.000,2.996
1966.2,448.0,448.0,0.0,0.0,523392.0,423740.0,1572864.0,496523.1,262144.0,121147.2,26,2.996,0,0.000,2.996
1981.2,448.0,448.0,0.0,0.0,523392.0,426040.5,1572864.0,496523.1,262144.0,121147.2,26,2.996,0,0.000,2.996

The fields here are:

  • Timestamp – The number of seconds since the JVM was started.
  • S0CS0 Capacity; the capacity in KB of Survivor Space 0.
  • S1CS1 Capacity; the capacity in KB of Survivor Space 1.
  • S0US0 Utilization; how much space in KB is being used in Survivor space 0.
  • S1US1 Utilization; how much space in KB is being used in Survivor space 1.
  • ECEden Capacity; the capacity, in KB, of Eden.
  • EUEden Utilization; how much space, in KB, is being used in Eden.
  • OCOld Capacity; the capacity, in KB, of the Old Generation.
  • OUOld Utilization; how much space, in KB, is being used by the Old Generation.
  • PCPerm Capacity; the capacity, in KB, of the Permanent Generation.
  • PUPerm Utilization; how much space, in KB, is being used by the Permanent Generation.
  • YGCYoung Garbage Collections; the number of young garbage collections that this JVM has performed so far.
  • YGCTYoung Garbage Collection Time; the number of seconds that this JVM has spent performing young garbage collections so far.
  • FGCFull Garbage Collections; the number of full garbage collections this JVM has performed so far.
  • FGCTFull Garbage Collection Time; the number of seconds that this JVM has spent performing young garbage collections so far.
  • GCTGarbage Collection Time; the total number of seconds that this JVM has spent performing garbage collections so far.  (YGCT + FGCT).

Understanding these values is important, otherwise, we’ll see lots of pretty graphs, but not understand how to interpret them.

Basic GC Analysis With Dex

While Dex is a general purpose tool, rather than a GC analysis tool, it is perfect for the task.  It’s the problem I use Dex most often to solve.

Within Dex, we create a flow consisting of the following:

  1. File Import – To read the jstat file off disk.
  2. View Data – To check this stage.
  3. Groovy Component – Loaded with jstatReader.groovy to parse the jstat file into columns.
  4. View Data – To check our work so far.
  5. XY Chart – To chart our data.

Steps 2 and 4 are purely for debugging.

File import

In this step we import some jstat data.

Image

View data

After execution, we check the contents of the data to ensure it looks correct.

Image

CSV Conversion Script

Here we load a premade script called jstatReader.groovy.  You are free to write your own.

Image

Once loaded, we see the script:

Image

This script

// Create a matcher for a regular expression m = (data[0][1] =~ /^\s*(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$/)
 // Create empty header structure header = []

// Set the data to the data minus the header row.
len = data.size() - 1
newData = data[1..len]
data = []

// Set the header to the first row of the original data
header = m[0][1..16] // For each row of data, apply the pattern and store the resultant fields in their proper place. newData.each
{
  row ->
  m =  (row[1] =~ /^\s*(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$/)
  data << m[0][1..16]
}

Don’t worry about it if you don’t know Groovy.  Dex is shipping with more and more out of the box ready scripts weekly.  And, if you are feeling intrepid, Groovy is a fantastic language to learn.  If you know Java already, you can use Groovy quite effectively.

Now one thing about the above script is that the script contains a regular expression which could choke a horse!  Personally, I grew up on Perl and don’t mind regular expressions and my programming style tends to leverage them when available.  They are succinct and powerful, but not everyone shares my view.  A famous quote by original Netscape developer Jamie Zawinski goes something like this:

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.”  Now they have two problems.

Since I have more problems than I can count, I never hesitate to reach for a regular expression when the need arises.

View Data

After execution, we again, check the results to see if the groovy script did what we intended.  We should see columnar data now, and we do.

Image

Chart Generations

In the following line chart, we can see the behavior across the generations.  The chart is a little busy because each generation is mapped on top of one another.  We will talk more about this, and fix this issue later.  Needless to say, this is difficult to read as it stands.

Image

Chart GC Count

Charting Time versus FGC or Full GC Count, we see only 2 collections over the whole time.  This VM appears well behaved.

Image

Overlapping Generations

Previously, our graph looked pretty busy.  It would be nice if we could map the generations to their location in memory as defined in the original JVM memory map.  Again, the Groovy Script component comes to our rescue.

We load one of the prefab components gcStatistics.groovy,

Image

The groovy code looks like this after loading the script.

Image

// Store the initial time offset because we want to create a
// new timestamp which begins with zero, we'll need this in
// the calculation.
timeOffset = Double.parseDouble(data[0][0])

// Copy existing headers into new additional headers with a C suffix
// indicating that the field is a Calculated field.
header += header[0..10].collect{ it.toUpperCase() + "C" }

// For each row of data:
data.eachWithIndex
{
 // Remember it's value and index into the data array.
  row, i ->
 // Transform the row of strings into a row of doubles.
  drow = row.collect { Double.parseDouble(it) }

 // Grab the original values we need for our calculations
 // from the row of doubles.
  (timestampc, s0c, s1c, s0u, s1u, ec, eu, oc, ou, pc, pu) = drow[0..10]
  timestampc -= timeOffset

 // Perform our calculations here
  s0cc = s0c
  s1cc = s0cc+s0c
  s0uc = s0u
  s1uc = s0cc + s1u
  ecc = ec + s1cc
  euc = eu + s1cc
  occ = ecc + oc
  ouc = ecc + eu
  pcc = occ + pc
  puc = occ + pu

 // Append the calculations to our data 
  [timestampc, s0cc, s1cc, s0uc, s1uc, ecc, euc, occ, ouc, pcc, puc].each { data[i] << it.toString() }
}

Final Output

Now that we have applied our Groovy Script, the graph is much easier to read:

With a little more attention to detail, we can create something worthy of dropping into a powerpoint.  Here I add a title and axis labels via the configuration tab at the left to produce:

Final Chart

Where To?

In subsequent posts, we’ll use Dex to explore some less well-behaved JVMs and hone our Java Memory analysis skills with Dex to change our vocabulary from saying wishy washy things like:

  • It looks like the memory usage is increasing…
  • It appears that we are doing frequent garbage collection…

To more definitive statements such as:

  • According to the regression analysis I performed on the Old Utilization, memory is increasing at a rate of X KB / hour.
  • We are performing garbage collections at a rate of 5 Full GC and 340 Young GC per hour.

You will then be able to wear your pocket protector proud as you get knowing smiles from the technical staff and blank stares from management.

You would never believe how many times I have been deceived by seeing what I wanted to see in data.  The statistical methods we will cover later will help you replace hunches and intuition with facts.

Advertisements

About patmartin

I am a coder and Data Visualization/Machine Learning enthusiast.
This entry was posted in Analysis, Dex, Java and tagged , , , , . Bookmark the permalink.

6 Responses to Basic Garbage Collection Analysis

  1. ss says:

    Hi,
    I can generate the jstat output and how do I analyse them. Where to get the dex tool and how to interpret the results any help on that?

    • patmartin says:

      I’ll include a specific example in the next release. Thought I had already. However, here is how you do it:

      1. File Input – Set up to read your jstat.log file.
      2. Groovy Script – Load the datasource/jstatReader.groovy script. This will parse the jstat log into a csv.
      3. Groovy Script (Optional) – Load the gc/gcStatisticsDeluxe.groovy script. This will create auxillary statistics you don’t get normally through jstat like regression lines.
      4. XYChart – Experiment with different views on the data to find the one that visualizes what you are trying to see most effectively. That’s what Dex is about.

      Execute the flow and you should see your data in the visual.

      As for interpreting the data, let me first refer you to jstat documenation:

      http://docs.oracle.com/javase/1.5.0/docs/tooldocs/share/jstat.html

      My scripts assume that you collected your jstat data with a command which looks something like:

      jstat -gc -t 15s > jstat.log

      Where PID is the process id of the JVM process and you can replace 15s with whatever polling interval (in seconds) that you desire.

      Hope this helps,

      – Pat

      • ss says:

        Dear Pat,
        Where is the link to download the latest version of the tool itself. I know how to run the jstat command but what option is your tool supporting just the gc or gcutil etc? So can the tool highlight just with one output or need more output sample?

  2. patmartin says:

    Latest alpha here:

    http://www.dexvis.com/downloads/Dex.0.6a.zip

    Or the 0.5 release here: (Either will do what you want)

    http://www.dexvis.com/downloads/Dex.0.5.zip

    Dex isn’t necessarily a garbage collection tool but a data analysis tool which is very useful for GC analysis. You can use it to do whatever you want with whatever data you have. My canned scripts handle the -gc jstat data as-is, but you can easily retool it to handle gcutil or whatever specific jstat data you would like to process.

    If you want, email me your jstat data and I’ll put a quick flow together that visualizes the data. Let me know which version you decide to install and I can email you back a the dex flow which analyzes that data. You’ll only have to load it up and point the reader over to the location of the jstat file on your disk. You’ll then have a working example.

    My email is patrick145@msn.com

    – Pat

  3. Pingback: java 메모리 이거면 가능하다. | My Blog

  4. Denzo says:

    if you’re interested in garbage collector monitor tool you can check SpyGlass Garbage Collector Analyzer:

    http://www.spyglasstools.com/documentation/spyglass-garbage-collector-analyzer/

    it is a free visualvm plugin.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s