Like many developers who spend a lot of time working with JVM languages, I spend a lot of time having to actually debug running applications. Throughout my interactions with my colleagues over the years, I've learned that not everyone is aware of the great tools that ship with the JDK (and some that don't) that can help you debug your application.
Specifically, the focus here is on applications that shine when you can't actually attach a debugger. Perhaps this is when you're running on a remote machine, or if you're looking at performance issues where stepping though a debugger wouldn't be as helpful.
Almost all of these applications are already installed if you've got the JDK on your system and are super easy to try out.
I'm going to walk through some of the applications I find to be most valuable:
All of these applications have detailed man pages, and plenty of online resources, but I wanted to throw all of them in one place as an overall primer for how to approach profiling JVM applications if you're unfamiliar with how to go about it.
Jps: List all java processes
Whenever I log onto a machine, the first command I run, almost
instinctively is jps. As long as the JDK is on your path, you'll have
jps. Jps is just a more convenient version of
ps, geared toward specifically the
java applications on your system.
<17-03-04 15:09>hkothari@hkothari-linux:~$ jps 22370 Main 24560 SparkSubmit 30230 Jps
This just tells you what java processes are running as your user on this machine. It's also got some sweet optional args, like "-m" to print out all the args passed into the main class, or "-v" to print out all of the arguments, including jvm arguments:
22370 Main -Xbootclasspath/a:/usr/share/intellijidea-ce/lib/boot.jar -Xms128m -Xmx750m -XX:ReservedCodeCacheSize=240m -XX:+UseConcMarkSweepGC -XX:SoftRefLRUPolicyMSPerMB=50 -ea -Dsun.io.useCanonCaches=false -Djava.net.preferIPv4Stack=true -XX:+HeapDumpOnOutOfMemoryError -XX:-OmitStackTraceInFastThrow -Dawt.useSystemAAFontSettings=on -Dsun.java2d.renderer=sun.java2d.marlin.MarlinRenderingEngine -XX:ErrorFile=/home/hkothari/java_error_in_IDEA_%p.log -XX:HeapDumpPath=/home/hkothari/java_error_in_IDEA.hprof -Didea.paths.selector=IdeaIC2016.3 -Djb.vmOptionsFile=/usr/share/intellijidea-ce/bin/idea64.vmoptions -Didea.platform.prefix=Idea -Didea.jre.check=true 32153 Jps -Dapplication.home=/usr/lib/jvm/java-8-jdk -Xms8m
It's a lot more convenient than trying to understand which java processes are running by parsing through the output of "ps" which can contain a lot more information which may be overwhelming.
For more details check out the oracle man page
Jstack: Inspect running threads
If you've ever wanted to inspect what part of the code is running right now in
your JVM, you might have already used
jstack if not, this is the tool to do just that!
Invocation of jstack is fairly simple, just pass in a PID and it will output a ton of useful information:
<17-03-06 11:23>hkothari@hkothari-linux:~$ jps 9730 Main 6826 Jps <17-03-06 11:23>hkothari@hkothari-linux:~$ jstack 9730 2017-03-06 11:23:32 Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.66-b17 mixed mode): "Attach Listener" #343 daemon prio=9 os_prio=0 tid=0x00007f87d0019000 nid=0x1b27 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "JobScheduler FJ pool 2/4" #341 daemon prio=4 os_prio=0 tid=0x00007f8771972800 nid=0x1a8e waiting on condition [0x00007f877ad80000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000e3562f58> (a java.util.concurrent.ForkJoinPool) at java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1824) at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1693) at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157) "JobScheduler FJ pool 0/4" #336 daemon prio=4 os_prio=0 tid=0x00007f878c494000 nid=0x198c waiting on condition [0x00007f87ac3f5000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000e3562f58> (a java.util.concurrent.ForkJoinPool) at java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1824) at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1693) at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157) ... "Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f881013d000 nid=0x260b in Object.wait() [0x00007f87f16fd000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143) - locked <0x00000000e0dad6d8> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209) "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f881013b000 nid=0x260a in Object.wait() [0x00007f87f17fe000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:502) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:157) - locked <0x00000000e0e920f0> (a java.lang.ref.Reference$Lock) "VM Thread" os_prio=0 tid=0x00007f8810136000 nid=0x2609 runnable "Gang worker#0 (Parallel GC Threads)" os_prio=0 tid=0x00007f881001c000 nid=0x2604 runnable "Gang worker#1 (Parallel GC Threads)" os_prio=0 tid=0x00007f881001d800 nid=0x2605 runnable "Gang worker#2 (Parallel GC Threads)" os_prio=0 tid=0x00007f881001f000 nid=0x2606 runnable "Gang worker#3 (Parallel GC Threads)" os_prio=0 tid=0x00007f8810021000 nid=0x2607 runnable "Concurrent Mark-Sweep GC Thread" os_prio=0 tid=0x00007f8810064000 nid=0x2608 runnable "VM Periodic Task Thread" os_prio=0 tid=0x00007f881019d800 nid=0x2612 waiting on condition JNI global references: 7704
I won't spend too much disecting the output, but basically it gives a list of all the threads, whether they are daemon or non-daemon threads, where they are in their execution and if they're currently running (RUNNABLE) or if they're waiting (WAITING) on a lock.
Taking a few of these may show you where your application is sitting. For example if your application is doing most of it's work in the "main" thread you can search the jstack for the "main" thread and look at where in the stack it's sitting. Taking a few more jstacks over some time window and seeing if it's stuck in the same place might tell you whether or not it's getting stuck (nothing changes), or if it's just running slowly (and you'll see the stacktrace change).
If your program is proceeding along but running slowly, manually cross-referencing jstacks will only get you so far. This is where JConsole, JVisualVM and Yourkit, all discussed below, can help more.
Jmap: Inspect allocated memory
I'd suspect a lot fewer people are familiar with Jmap. It's another awesome tool that ships with the JVM and allows you to inspect the heap of a running JVM. So, while it won't tell you where your code is running, it will tell you what memory is currently allocated in your JVM. It's got a few useful modes, but the two I find myself using are:
-histoprints out a histogram of the different classes allocated in memory. This will give you a rough idea of the distribution of what objects are using up your memory.
-dumpperforms a full dump of what's in memory which can be loaded up into tools like yourkit, eclipse memory analyzer and some other tools to see that actual objects that are loaded.
The histogram option is useful to get a quick peek into the allocated memory, so I find myself often using it when I find an application unexpectedly using more memory than usual. The output look like the following:
<17-03-06 11:23>hkothari@hkothari-linux:~$ jps 9730 Main 13212 Jps <17-03-06 11:36>hkothari@hkothari-linux:~$ jmap -histo 9730 | head -n 15 num #instances #bytes class name ---------------------------------------------- 1: 358188 44471896 [C 2: 318396 17299712 [Ljava.lang.Object; 3: 86017 15843752 [B 4: 83668 9482544 [I 5: 305326 7327824 java.lang.String 6: 60767 4985712 [Lscala.collection.mutable.HashEntry; 7: 45690 4985576 java.lang.Class 8: 154260 4936320 java.util.Hashtable$Entry 9: 67161 4298304 com.intellij.psi.impl.source.tree.CompositeElement 10: 103291 3305312 java.util.concurrent.locks.AbstractQueuedSynchronizer$Node 11: 101949 3262368 java.util.concurrent.SynchronousQueue$TransferStack$SNode 12: 131832 3163968 scala.Tuple2
The types are a little bit hard to interpret, but for example the first line item states that the most amount of heap in this process is "[C" which stands for an array ("[") of characters ("C"), with ~40MB of space. This makes sense because I was jstacking my IntelliJ process, but in some cases you can catch something unexpected here.
In most cases looking at the histogram matches your expectation and you're forced to do deeper searching for what's going wrong. Taking
a full heap dump using the
-dump flag will let you copy the entire application's memory to a file and then open it up with other tools
to look at the actual objects within memory. From here you may be able to key in on specific objects that are getting too large or being
created too often. The outputs of a heap dump can be read by
yourkit and others.
It is worth noting that a heap dump will take up as much disk space as the heap size itself, so if you have a 60GB heap
for example, make sure you also have 60GB of disk available where you're dumping it.
Once again, like jstack this is justs a snapshot, so while it gives you some insight, it may not be enough and you may want to hop over to something which allows you to watch over time. JConsole/JVisualVM and Yourkit will enable that.
JConsole/JVisualVM/JMC: All of the above tools in a GUI
JConsole, JVisualVM and JMC are three other tools that ship with the JDK which are more full featured visual equivalents of the tools above. For the most part they share a lot of functionality, so I generally use JVisualVM or JMC.
With all of these tools you can list and connect to local Java processes and then collect/view JVM, heap and stack information, see the JVM enviroment and potentially execute code via MBeans.
JConsole is a little bit less full featured, so I'll focus on JVisualVM.
One of the big other capabilities JVisualVM provides on top of the command line tools listed above are a sampler and profiler for the thread and heap information provided by jstack/jmap. Once you've selected a JVM, you can use the sampler or profiler to take snapshots of the running threads over time to see where time is being spent. This is super useful for tracking down performance bottlenecks and understanding.
It also collects charts that let you watch heap size over time, garbage collection time over time, etc.
Finally, they provide an easy interface for accessing JMX for the process you've connected to. JMX provides a lot of the charts we just mentioned by exposing methods that can be called locally/remotely on the JVM to output information about the internals. Plus if you've written your own JMX beans in your application you can access them via JConsole/JVisualVM. This allows some really handy RPC opportunities that you may find useful.
A few notes on the differences between the 3 tools:
- JConsole has heap/GC charts and interacting directly with MBeans, but doesn't have a sampler or profiler.
- JVisualVM has all the same information as JConsole, minus MBeans but with a profiler/sampler.
- If you're not using MBeans directly, which you probably aren't JVisualVM is usually more useful.
- If you are using MBeans, there's JMX plugin which you can download, but I've never tried it.
- JMC is a newer tool, it's got a sort of clunkier interface, but it has what appears to be all the information in the other tools, as well as the ability to record statistics for viewing later and the ability to set up alerts when certain events take place.
Honestly, if you're unfamiliar with these tools, I'd strongly encourage you to open up JVisualVM and just give it a try, inspecting the JVisualVM process itself if you don't have anything else running. It ends up being incredibly handy when you're running into issues so knowing how to use it can go a long way.
This is by no means a comprehensive list of all of the tools available to you, these
are just the ones I find myself using the most.
A quick pass through your JDK
bin/ folder might show other tools which you could
make use of.
jmap, etc, I'd recommend trying them out and taking a look
at their man pages to get an idea of the parameters that are available.
Some tools which I left out but you might find useful:
jstat- Lets you collect GC statistics from a running JVM in the same way as unix's
jhat- Command line tool for analyzing heap dumps.
JVM Args for Profiling
Also, along with tools that you can use to inspect running JVMs, there are a ton of JVM options you can set when you start up the application to log more information during runtime in the event of failure. Some favorites here are:
-XX:+HeapDumpOnOutOfMemory- forces the JVM to save off a heap dump when the program exists with an out of memory. This allows for much easier post-mortem of what filled up the heap.
-Xloggc:- followed by a file path, it will log all GC events to that file so you can track how much time is spent in GC. This is typically used with some of the other PrintGC arguments mentioned on the page linked below.
-XX:+ShowMessageBoxOnError- Shows an error dialog when the JVM exits due to an exception which allows you to attach a debugger, jstack or perform other inspection operations.
You can find more documentation on these parameters on Java's website.
Finally, I couldn't write a post on profiling without talking about one of my favorite tools: Yourkit. Yourkit's Java Profiler is one of the most useful tools across the board when I'm trying to chase down performance problems or bugs at runtime.
The tool is a primarily a profiler, which it's very good at. Especially useful is it's CPU tab which lets you look at overall hotspots in the JVM, similar to the way JVisualVM does, but also allows you to see the aggregated stack traces. Some examples of what this looks like can be found on the feature page Generally the JVM tools can key you into the right area but sometimes they can throw you off by telling you not useful things like "all of your time is being spent in a Socket.read" when in reality you might have a ton of these in your application and you want to know which code path specifically is spending time there.
One of the other incredibly handy features it has, which I haven't been able to find anywhere else is that it can keep a very detailed event timeline in for your JVM logging when threads are created or started, sockets are opened or closed as well as sources for all these events. So for example, I was able to use it to chase down which thread started a non-daemon thread which was preventing my application from exiting on my own. This is something that you can't get from any of the other tools listed above.
You can find a full list of features and more information on their website.
If you've made it this far, thanks for reading through. and if you have any questions or suggestions, my email can be found on the about page.