Bytecode Instrumentation

Edited on November 23, 2016

When instrumentation is needed

The agent and process attach connectors become interesting once sampling is no longer enough to identify the root cause of your problem, for instance if you need to know exactly how many times a method is called within a given time frame, or exactly how long each call lasts.
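To make the distinction concrete, here is a minimal hand-written sketch of what instrumenting a method amounts to: every call is counted and timed, rather than being observed only when a sample happens to land on it. The names `CallProbe` and `timed` are hypothetical and are not part of djigger; a real agent injects comparable logic into the bytecode instead of wrapping calls by hand.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Hypothetical probe: counts and times every call routed through it. */
final class CallProbe {
    private final AtomicLong calls = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();

    /** Runs the given work, recording one call and its elapsed time. */
    <T> T timed(Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            // Recorded even if the work throws, so no call goes uncounted.
            calls.incrementAndGet();
            totalNanos.addAndGet(System.nanoTime() - start);
        }
    }

    long callCount() { return calls.get(); }

    long totalNanos() { return totalNanos.get(); }

    public static void main(String[] args) {
        CallProbe probe = new CallProbe();
        for (int i = 0; i < 1000; i++) {
            probe.timed(() -> Math.sqrt(42.0));
        }
        System.out.println("calls=" + probe.callCount() + ", total ns=" + probe.totalNanos());
    }
}
```

The call count here is exact by construction, which is the kind of completeness sampling cannot offer.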

However, as we point out in many of our case studies, we've noticed over the years that the vast majority of problems can be understood and diagnosed without instrumentation, provided you squeeze everything you can out of your stacktrace samples.

In many cases, stacktrace sampling can reveal an approximation of a given method's response time and a lower bound on the number of times it was called. And again, for a lot of problems, that's exactly what you need to understand what's going on in your code.
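As an illustration of how such figures can be derived from samples, the sketch below works on the samples of a single thread, reduced to one boolean per sample (whether the method was on the stack). It is only one plausible way to compute these estimates, not a description of djigger's internals; the class `SampleEstimates`, its methods and the 100 ms interval are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: derives rough figures for one method from stack samples. */
final class SampleEstimates {

    /** Approximate time in the method: number of samples containing it times the sampling interval. */
    static long approxTimeMillis(List<Boolean> methodOnStack, long intervalMillis) {
        long hits = methodOnStack.stream().filter(Boolean::booleanValue).count();
        return hits * intervalMillis;
    }

    /** Lower bound on calls: each uninterrupted run of samples containing the method is at least one call. */
    static int minCallCount(List<Boolean> methodOnStack) {
        int runs = 0;
        boolean previous = false;
        for (boolean present : methodOnStack) {
            if (present && !previous) {
                runs++; // the method reappeared after being absent: a new call must have started
            }
            previous = present;
        }
        return runs;
    }

    public static void main(String[] args) {
        // One entry per sample of a single thread: true if the method was on the stack.
        List<Boolean> samples = Arrays.asList(
                false, true, true, false, true, false, false, true, true, true);
        System.out.println("approx time (ms): " + approxTimeMillis(samples, 100)); // 6 hits -> 600
        System.out.println("calls (at least): " + minCallCount(samples));          // 3 runs -> 3
    }
}
```

The count is only a lower bound because several short calls can fall between two samples, or within one run of consecutive samples, without being told apart.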

Nevertheless, instrumentation introduces a new notion of completeness. Whether you do it out of curiosity, for general comfort or because you're tackling a problem that actually requires instrumenting methods, we've made two different connectors available to deploy our agent in a remote JVM and perform bytecode instrumentation.

The connectors

The piece of code we use to instrument Java bytecode can be loaded in two different ways: via the -javaagent option (when the JVM is booting) and via Sun's Virtual Machine attach code (at runtime, in memory).
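For readers unfamiliar with these two loading paths, the following skeleton shows how a generic agent built on the standard `java.lang.instrument` package exposes one entry point for each. It is a sketch, not djigger's actual agent: the class names `SketchAgent` and `PassThroughTransformer` are hypothetical, and the transformer deliberately leaves every class untouched.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

/** Hypothetical agent skeleton showing both loading paths. */
final class SketchAgent {

    /** Called by the JVM at startup when the agent jar is passed with -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        install(inst);
    }

    /** Called by the JVM when the agent jar is loaded into an already running process. */
    public static void agentmain(String agentArgs, Instrumentation inst) {
        install(inst);
    }

    private static void install(Instrumentation inst) {
        inst.addTransformer(new PassThroughTransformer());
    }

    /** Sees each class as it is loaded; returning null tells the JVM to keep the bytes unchanged. */
    static final class PassThroughTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) {
            // A real agent would rewrite classfileBuffer here for the methods the user selected.
            return null;
        }
    }
}
```

When packaged as a jar, such a class is declared through the `Premain-Class` and `Agent-Class` manifest attributes, one per loading path.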

The details you need to connect to a remote JVM via either one of these modes are provided on our installation page.

Javaagent

Our java agent has two roles :

  • receiving, interpreting and executing sampling and instrumentation directives from the client
  • shipping the results back to the client

The two main advantages of using "javaagent" mode are as follows :

  • you'll be able to connect to that agent from just about anywhere (provided the network routes are in working order)
  • it's fairly platform / jvm vendor / jdk minor version agnostic (the agent will work with JDK 1.6+)

The main drawback is that you need to reboot the JVM that you wish to instrument.

Process Attach

Process Attach allows us to deploy our agent code dynamically, without rebooting the target JVM or modifying any flags or options. However, in order to process attach successfully to a target JVM, your djigger client instance needs to meet the following requirements :

  • it has to run on the same host as the monitored JVM
  • it has to run on the same JRE as the target JVM
  • the tools.jar of the exact target JVM version has to be made available on the client's classpath
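These requirements follow from the way the JDK Attach API works. The sketch below uses the `com.sun.tools.attach` classes directly to list the JVMs visible on the local host and to load an agent jar into one of them. It is an illustration, not djigger's connector code: the class `AttachSketch` is hypothetical, and the process id and agent jar path are placeholders supplied on the command line.

```java
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
import java.util.List;

/** Hypothetical attach helper built on the JDK Attach API. */
final class AttachSketch {

    /** Lists the JVMs on this host that the current user can see. */
    static List<VirtualMachineDescriptor> localJvms() {
        return VirtualMachine.list();
    }

    /** Loads an agent jar into the JVM with the given process id, then detaches. */
    static void loadAgent(String pid, String agentJarPath) throws Exception {
        VirtualMachine vm = VirtualMachine.attach(pid);
        try {
            vm.loadAgent(agentJarPath); // triggers the agent's agentmain method in the target JVM
        } finally {
            vm.detach();
        }
    }

    public static void main(String[] args) throws Exception {
        for (VirtualMachineDescriptor d : localJvms()) {
            System.out.println(d.id() + "  " + d.displayName());
        }
        // Usage (placeholders): java AttachSketch <pid> <path-to-agent-jar>
        if (args.length == 2) {
            loadAgent(args[0], args[1]);
        }
    }
}
```

The API only discovers and attaches to processes on the local machine, hence the same-host requirement. On JDK 8 and earlier these classes ship in tools.jar rather than in the runtime itself, which is why that jar has to be on the client's classpath.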

PS: we might work on a PA-proxy in the future if people find it useful, which would essentially eliminate the first requirement.

Instrumenting a method

After choosing one of the two instrumentation-enabled connectors, you'll notice that there's a new pane at the bottom of the djigger window. That pane will be used to keep track of which methods you've instrumented, what their statistics are and will also be the starting point for deep-dive transaction analysis.

Let's say we're starting a javaagent-enabled JVM from eclipse (of course, it could be any application) :

eclipse app

And that we're connecting locally via agent on port 13987 :

Agent connection

PS: for the purpose of providing this documentation, I'm using the BasicJMXJVM class from the collector project, available here.

Now you can see the additional instrumentation pane at the bottom :

instrumentation pane

Exact path vs All paths to node

As we support what we call contextual instrumentation, we provide two mechanisms to instrument a method. You can either decide to gather statistics on all the invocations made on that method, or only on the invocations made specifically through the code path leading to the very node you've right-clicked. That's the distinction we make between the "all paths" and "only this path" alternatives.
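The difference between the two options can be sketched as a filter applied to each invocation. The code below is purely illustrative and does not reflect how the djigger agent is implemented: `PathFilter`, its fields and the method names used in the example paths are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical filter illustrating "all paths" versus "only this path". */
final class PathFilter {
    private final List<String> selectedPath; // root-to-node path of the right-clicked node
    private final boolean exactPathOnly;     // true for "only this path", false for "all paths"

    PathFilter(List<String> selectedPath, boolean exactPathOnly) {
        if (selectedPath.isEmpty()) {
            throw new IllegalArgumentException("selectedPath must contain at least the target method");
        }
        this.selectedPath = selectedPath;
        this.exactPathOnly = exactPathOnly;
    }

    /** Decides whether an invocation reached through callPath (root first) should be recorded. */
    boolean shouldRecord(List<String> callPath) {
        if (callPath.isEmpty()) {
            return false;
        }
        String target = selectedPath.get(selectedPath.size() - 1);
        String invoked = callPath.get(callPath.size() - 1);
        if (!target.equals(invoked)) {
            return false; // a different method was called
        }
        return !exactPathOnly || callPath.equals(selectedPath);
    }

    public static void main(String[] args) {
        List<String> clicked = Arrays.asList("Servlet.service", "OrderService.place", "Dao.save");
        List<String> other = Arrays.asList("BatchJob.run", "Dao.save");

        PathFilter allPaths = new PathFilter(clicked, false);
        PathFilter thisPath = new PathFilter(clicked, true);

        System.out.println(allPaths.shouldRecord(other));   // true: same method, any caller
        System.out.println(thisPath.shouldRecord(other));   // false: reached through another path
        System.out.println(thisPath.shouldRecord(clicked)); // true: exactly the clicked path
    }
}
```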

So go ahead and right-click a node, then pick the "all paths" option for instance :

instrumentation right click

You'll see that the method you've chosen is now listed on the left side of the instrumentation pane :

method listed

At this point, the djigger agent will start gathering statistics on calls made to that method.

Understanding and analyzing the results

As of djigger 1.4.2, we provide two ways of visualizing the resulting data : a statistics table and a list of the individual calls made to that method, also known as "transactions".

Statistics table

If you instrumented your method early enough (meaning that further invocations were made after you instrumented your node), you should see statistics appear in the left sub-pane about 10 to 15 seconds after instrumenting the node. If you then click the row containing these statistics, the right sub-pane will load the list of individual calls made to the node :

instrumentation statistics
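To illustrate the relationship between the two sub-panes, the sketch below collapses a list of individual call durations into one row of aggregate figures. The durations are made-up sample values, `CallStatistics` is a hypothetical name, and the columns actually shown by djigger may differ from the ones computed here.

```java
import java.util.LongSummaryStatistics;
import java.util.stream.LongStream;

/** Hypothetical aggregation of individual call durations into a single statistics row. */
final class CallStatistics {

    /** Summarizes call durations (in milliseconds) as count, sum, min, average and max. */
    static LongSummaryStatistics summarize(long... durationsMillis) {
        return LongStream.of(durationsMillis).summaryStatistics();
    }

    public static void main(String[] args) {
        // Made-up durations of five individual calls, in milliseconds.
        LongSummaryStatistics row = summarize(12, 48, 15, 230, 17);
        System.out.println("count=" + row.getCount()
                + " total=" + row.getSum()
                + " min=" + row.getMin()
                + " avg=" + row.getAverage()
                + " max=" + row.getMax());
    }
}
```

In this made-up data a single slow call dominates the total, which is the kind of situation where the list of individual calls is more telling than the aggregate row.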

Deep analysis of an individual call

Now if the method you've instrumented is generic, not every transaction going through that method will necessarily have a performance problem. That's why you might want to look into individual method calls and figure out where the response time of that particular call was spent.

djigger allows you to do just that by analyzing any individual call listed on the right side of the instrumentation pane :

analyze individual call

This will open a new window containing the visualization trees you're now familiar with, except that in this window you'll only see the stacktraces relevant to that particular method call instance.

new pane individual call

You can then use the stacktrace and node filters just as you would in sampling mode with the regular tree pane of djigger.

Recommendations

While sampling is a near-free operation for the JVM, instrumentation can be expensive, as its overhead is a direct function of the frequency at which the instrumented method is called in the application's code. Keep this in mind at all times and, before instrumenting a new node, ask yourself whether doing so is reasonable.
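A back-of-the-envelope estimate can help with that question. The sketch below simply multiplies a call rate by a per-call probe cost. The 1 microsecond cost is an arbitrary assumption chosen for the example, not a measured figure for djigger, and `OverheadEstimate` is a hypothetical name.

```java
/** Hypothetical estimate: instrumentation overhead grows linearly with call frequency. */
final class OverheadEstimate {

    /** CPU-seconds of probe work added per second of wall-clock time. */
    static double cpuSecondsPerSecond(double callsPerSecond, double probeCostNanos) {
        return callsPerSecond * probeCostNanos / 1_000_000_000.0;
    }

    public static void main(String[] args) {
        double assumedProbeCostNanos = 1_000; // assumption: 1 microsecond per instrumented call

        // A method called 200 times per second: negligible extra CPU.
        System.out.println(cpuSecondsPerSecond(200, assumedProbeCostNanos));

        // A hot method called 5 million times per second: several cores' worth of probe work.
        System.out.println(cpuSecondsPerSecond(5_000_000, assumedProbeCostNanos));
    }
}
```

Under this assumption the first method adds 0.0002 CPU-seconds per second while the second adds 5, so the same probe is harmless on one node and prohibitive on the other.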

However, on dev or test environments, and for debugging purposes, even if you cause high overhead, you might still want to pay that CPU fee and instrument a frequently called node.

In the end, you're responsible for what you do with the tool.