Thursday, May 31, 2012

Insight to Thread Dump

Java VM overview

The Java virtual machine is really the foundation of any Java EE platform. This is where your middleware and applications are deployed and active.

The JVM provides the middleware software and your Java / Java EE program with:

- A runtime environment for your Java / Java EE program (bytecode format)
- Several program features and utilities (IO facilities, data structure, Threads management, security, monitoring etc.)
- Dynamic memory allocation and management via the garbage collector

Your JVM can reside on many OS (Solaris, AIX, Windows etc.) and depending of your physical server specifications, you can install 1...n JVM processes per physical / virtual server.



JVM and Middleware software interactions

Find below a diagram showing you a high level interaction view between the JVM, middleware and application(s).


This is showing you a typical and simple interaction diagram between the JVM, middleware and application. As you can see, the Threads allocation for a standard Java EE application are done mainly between the middleware kernel itself and JVM (there are some exceptions when application itself or some APIs create Threads directly but this is not common and must be done very carefully).

Also, please note that certain Threads are managed internally within the JVM itself such as GC (garbage collection) Threads in order to handle concurrent garbage collections.

Since most of the Thread allocations are done by the Java EE container, it is important that you understand and recognize the Thread Stack Trace and identify it properly from the Thread Dump data. This will allow you to understand quickly the type of request that the Java EE container is attempting to execute.

From a Thread Dump analysis perspective, you will learn how to differentiate between the different Thread Pools found from the JVM and identify the request type.

This last section will provide you with an overview of what is a JVM Thread Dump for the HotSpot VM and the different Threads that you will find. Detail for the IBM VM Thread Dump format will be provided in the part 4.
Please note that you will find the Thread Dump sample used for this article from the root cause analysis forum.

JVM Thread Dump – what is it?

A JVM Thread Dump is a snapshot taken at a given time which provides you with a complete listing of all created Java Threads.

Each individual Java Thread found gives you information such as:

Thread name; often used by middleware vendors to identify the Thread Id along with its associated Thread Pool name and state (running, stuck etc.)

      - Thread type & priority ex: daemon prio=3 ** middleware softwares typically create their Threads as daemon meaning their Threads are running in background; providing services to its user e.g. your Java EE application **

       - Java Thread ID ex: tid=0x000000011e52a800 ** This is the Java Thread Id obtained via java.lang.Thread.getId() and usually implemented as an auto-incrementing long 1..n**

       - Native Thread ID ex: nid=0x251c** Crucial information as this native Thread Id allows you to correlate for example which Threads from an OS perspective are using the most CPU within your JVM etc. **

       - Java Thread State and detail ex: waiting for monitor entry [0xfffffffea5afb000] java.lang.Thread.State: BLOCKED (on object monitor)
** Allows to quickly learn about Thread state and its potential current blocking condition **

        - Java Thread Stack Trace; this is by far the most important data that you will find from the Thread Dump. This is also where you will spent most of your analysis time since the Java Stack Trace provides you with 90% of the information that you need in order to pinpoint root cause of many problem pattern types as you will learn later in the training sessions

        - Java Heap breakdown; starting with HotSpot VM 1.6, you will also find at the bottom of the Thread Dump snapshot a breakdown of the HotSpot memory spaces utilization such as your Java Heap (YoungGen, OldGen) & PermGen space. This is quite useful when excessive GC is suspected as a possible root cause so you can do out-of-the-box correlation with Thread data / patterns found
1Heap
2PSYoungGen      total 466944K, used 178734K [0xffffffff45c00000, 0xffffffff70800000, 0xffffffff70800000)
3eden space 233472K, 76% used [0xffffffff45c00000,0xffffffff50ab7c50,0xffffffff54000000)
4from space 233472K, 0% used [0xffffffff62400000,0xffffffff62400000,0xffffffff70800000)
5to   space 233472K, 0% used [0xffffffff54000000,0xffffffff54000000,0xffffffff62400000)
6PSOldGen        total 1400832K, used 1400831K [0xfffffffef0400000, 0xffffffff45c00000, 0xffffffff45c00000)
7object space 1400832K, 99% used [0xfffffffef0400000,0xffffffff45bfffb8,0xffffffff45c00000)
8PSPermGen       total 262144K, used 248475K [0xfffffffed0400000, 0xfffffffee0400000, 0xfffffffef0400000)
9object space 262144K, 94% used [0xfffffffed0400000,0xfffffffedf6a6f08,0xfffffffee0400000)
Thread Dump breakdown overview

In order for you to better understand, find below a diagram showing you a visual breakdown of a HotSpot VM Thread Dump and its common Thread Pools found:

As you can there are several pieces of information that you can find from a HotSpot VM Thread Dump. Some of these pieces will be more important than others depending of your problem pattern(problem patterns will be simulated and explained in future articles).

For now, find below a detailed explanation for each Thread Dump section as per our sample HotSpot Thread Dump:

# Full thread dump identifier
This is basically the unique keyword that you will find in your middleware / standalong Java standard output log once you generate a Thread Dump (ex: via kill -3 for UNIX). This is the beginning of the Thread Dump snapshot data.
1Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.0-b11 mixed mode):
# Java EE middleware, third party & custom application Threads
This portion is the core of the Thread Dump and where you will typically spend most of your analysis time. The number of Threads found will depend on your middleware software that you use, third party libraries (that might have its own Threads) and your application (if creating any custom Thread, which is generally not a best practice).

In our sample Thread Dump, Weblogic is the middleware used. Starting with Weblogic 9.2, a self-tuning Thread Pool is used with unique identifier “'weblogic.kernel.Default (self-tuning)”
1"[STANDBY] ExecuteThread: '414' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=3 tid=0x000000010916a800 nid=0x2613 in Object.wait() [0xfffffffe9edff000]
2   java.lang.Thread.State: WAITING (on object monitor)
3        at java.lang.Object.wait(Native Method)
4        - waiting on <0xffffffff27d44de0> (a weblogic.work.ExecuteThread)
5        at java.lang.Object.wait(Object.java:485)
6        at weblogic.work.ExecuteThread.waitForRequest(ExecuteThread.java:160)
7        - locked <0xffffffff27d44de0> (a weblogic.work.ExecuteThread)
8        at weblogic.work.ExecuteThread.run(ExecuteThread.java:181)
# HotSpot VM Thread
This is an internal Thread managed by the HotSpot VM in order to perform internal native operations. Typically you should not worry about this one unless you see high CPU(via Thread Dump & prstat / native Thread id correlation).
1"VM Periodic Task Thread" prio=3 tid=0x0000000101238800 nid=0x19 waiting on condition
# HotSpot GC Thread
When using HotSpot parallel GC (quite common these days when using multi physical cores hardware), the HotSpot VM create by default or as per your JVM tuning a certain # of GC Threads. These GC Threads allow the VM to perform its periodic GC cleanups in a parallel manner, leading to an overall reduction of the GC time; at the expense of increased CPU utilization.
1"GC task thread#0 (ParallelGC)" prio=3 tid=0x0000000100120000 nid=0x3 runnable
2"GC task thread#1 (ParallelGC)" prio=3 tid=0x0000000100131000 nid=0x4 runnable
3………………………………………………………………………………………………………………………………………………………………
This is crucial data as well since when facing GC related problems such as excessive GC, memory leaks etc, you will be able to correlate any high CPU observed from the OS / Java process(es) with these Threads using their native id value (nid=0x3). You will learn how to identify and confirm this problem is future articles.

# JNI global references count
JNI (Java Native Interface) global references are basically Object references from the native code to a Java object managed by the Java garbage collector. Its role is to prevent collection of an object that is still in use by native code but technically with no "live" references in the Java code.

It is also important to keep an eye on JNI references in order to detect JNI related leaks. This can happen if you program use JNI directly or using third party tools like monitoring tools which are prone to native memory leaks.
1JNI global references: 1925
# Java Heap utilization view
This data was added back to JDK 1 .6 and provides you with a short and fast view of your HotSpot Heap. I find it quite useful when troubleshooting GC related problems along with HIGH CPU since you get both Thread Dump & Java Heap in a single snapshot allowing you to determine (or to rule out) any pressure point in a particular Java Heap memory space along with current Thread computing currently being done at that time. As you can see in our sample Thread Dump, the Java Heap OldGen is maxed out!
1Heap
2 PSYoungGen      total 466944K, used 178734K [0xffffffff45c00000, 0xffffffff70800000, 0xffffffff70800000)
3  eden space 233472K, 76% used [0xffffffff45c00000,0xffffffff50ab7c50,0xffffffff54000000)
4  from space 233472K, 0% used [0xffffffff62400000,0xffffffff62400000,0xffffffff70800000)
5  to   space 233472K, 0% used [0xffffffff54000000,0xffffffff54000000,0xffffffff62400000)
6 PSOldGen        total 1400832K, used 1400831K [0xfffffffef0400000, 0xffffffff45c00000, 0xffffffff45c00000)
7  object space 1400832K, 99% used [0xfffffffef0400000,0xffffffff45bfffb8,0xffffffff45c00000)
8 PSPermGen       total 262144K, used 248475K [0xfffffffed0400000, 0xfffffffee0400000, 0xfffffffef0400000)
9  object space 262144K, 94% used [0xfffffffed0400000,0xfffffffedf6a6f08,0xfffffffee0400000)
I hope this article has helped to understand the basic view of a HotSpot VM Thread Dump.The next article will provide you this same Thread Dump overview and breakdown for the IBM VM.


Reference
http://www.javacodegeeks.com

1 comment:

Unknown said...

fastThread applies these intelligent patterns on the thread dumps and instantly identifies root causes of the problem thread dump analyser