What on earth is dynamic instrumentation code coverage, and why might you want it?
Java code coverage tools, like those embedded in IDEs or provided as CI plugins, are great, but they have one limitation – the tests you run also have to be written in Java or another JVM language. What if you have suites of tests in other, non-JVM languages and would like to know what is covered and what is not?
I’ve faced such an issue – we had a really big suite of e2e REST API tests written in Python, and we executed them against a big Java application running on Tomcat. We wanted to track where these tests go in the code. But how to check that?
A possible solution is the so-called dynamic instrumentation of Java code using Java agents. In short, a Java agent kicks in at class-loading time and transforms the class bytecode to record statistics about executed lines. Magic applied in practice.
According to my research there are two reasonable solutions providing (among other features) dynamic instrumentation:
- Jacoco – has an agent mode; recognized by quite a few people (4090 Stack Overflow matches as of February 2019), not necessarily for its ability to track execution triggered from outside Java, but for its Maven plugins integrated with Jenkins plugins, Sonar plugins, etc. I used it in some previous projects to observe code coverage from release to release, so it was my first shot. Disappointingly, it didn’t work well for this use case. More on that in the Why not Jacoco section.
- JCov – has an agent mode; developed as a tool targeted at Java Java developers (not a typo, I mean the people actually developing Java, yep). Probably originating in the depths of Oracle’s basements. Known to a small group of unicorns (9 Stack Overflow posts in three years and one reasonable presentation). You have to check it out and build it yourself. I went there, I’m alive, and I’m back with a practical tutorial.
I used it to generate code coverage reports while running e2e Python tests, but of course with such a flexible agent you can track coverage of code called any way you want. Postman suite? No problem. Clicking around the GUI, wondering how the heck it works inside, without staring at the debugger? Here you go.
Dynamic vs static instrumentation
As I mentioned before, the instrumentation (the process of altering the bytecode of the examined classes) may be dynamic, and this is the flavour I used, but it can also be static. With static instrumentation you modify the class files before you run them to gather coverage statistics. Compared to dynamic instrumentation, it makes execution faster (as there is no overhead at class-loading time) and consumes much less memory in general. JCov also supports static instrumentation; I haven’t tried it though, as dynamic mode suited me better. You can find more information on static instrumentation mode in the linked JavaOne presentation from the JCov developers themselves, and also in the indispensable verbose help built into jcov.jar.
Setup and operating instructions for JCov in dynamic mode for Tomcat application
Building the tool itself
- Clone the JCov Mercurial repo:
hg clone http://hg.openjdk.java.net/code-tools/jcov
- You will need the following JAR dependencies, built or downloaded manually, to build JCov (I list the exact versions I used; other versions may work too):
- In the /build/build.properties file, set the paths to the aforementioned JARs.
- Go to the /build directory and execute the ‘ant’ command. If you got all the paths right, this should finally build
Using the JCov tool.
- Add jcov.jar to the server startup configuration in Java agent mode, exposing the command server
- Start the application server
- Command the agent to dump data to a file (to keep server startup coverage out of the statistics)
- Run the cases whose coverage interests us against the application server, by any means (for instance Python nosetests, a Postman suite, even manual clicking around the GUI or a browser automation plugin)
- Dump the coverage data again to file
- Generate HTML or other report from coverage data
Add jcov.jar to the server startup configuration in Java agent mode, exposing the command server:
It is not that easy to get started with JCov. You actually have to build it yourself using manually downloaded dependencies, as the site with released versions doesn’t really work. I have built the final jcov.jar for you, and I have also copied the intermediate dependencies to my repository to save you the hassle of hunting them down across the net. Still, I recommend downloading the sources yourself: with documentation being scarce, going to the sources can reveal some hidden functionality. At least that was the case for me.
If you want to prepare it yourself, proceed as follows:
These are the official build instructions, although they seem a bit ‘vintage’: JCov supports modern Java versions, yet the readme still refers to JDK 5. I built it on JDK 8.
I found them in various places on the web. You can also download them from my GitHub repository.
# path to asm libraries
asm.jar = /asm-7.0.jar
asm.tree.jar = /asm-tree-7.0.jar
asm.util.jar = /asm-util-7.0.jar
# path to javatest library (empty value allowed if you do not need jtobserver.jar)
javatestjar = /javatest.jar
jcov.jar embeds all the functions needed to generate a coverage report, from instrumenting classes to compiling the HTML (or other format) report itself.
The JCov tool has many sub-tools and usage scenarios – what’s below is a concrete case where dynamic instrumentation is used to generate and dump coverage data on the fly from an application server which was not pre-instrumented before running.
The phases to generate such a report are as follows:
Now let’s go through these phases in detail:
In the case of the Tomcat server, the configuration went as follows:
Let’s explain the params:
file=important-coverage-data.xml,merge=gensuff – the root of the filename where the coverage data will be dumped. The ‘gensuff’ merge option makes JCov create a separate file, with a random suffix, every time dumping is requested. There are also other modes of operation, for instance overwriting the existing data in the XML file on disk, or merging with its content. In my case, separate files with suffixes seemed the most practical mode.
log, log.level – you know what these are.
include=com\.mycompany* – makes the Java agent instrument only classes in the selected packages. By default all classes are instrumented, but beware: that may cause excessive memory consumption, even OOM errors for bigger systems. Most probably you’re not interested in coverage data for external libraries anyway. Note that you can use multiple include and exclude statements for a more sophisticated configuration (refer to the jcov.jar built-in help for more information).
agent.port=3336 – in this configuration a simple command-accepting server is started inside the agent, waiting for a signal to dump coverage data to a file. Other options are dumping data to a file on VM exit, or using the so-called ‘Grabber’ module (read further).
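If the startup configuration is generated by a script, the agent argument can be assembled from the parameters described above. The following is only a minimal Python sketch: it assumes the standard JVM `-javaagent:<jar>=<options>` form with comma-separated options, `build_agent_arg` is a hypothetical helper name, `/path/to/jcov.jar` is a placeholder, and the log options are left out because their values depend on your installation.

```python
def build_agent_arg(jar_path, options):
    """Compose a JVM -javaagent argument from agent options (insertion order is kept)."""
    joined = ",".join(f"{key}={value}" for key, value in options.items())
    return f"-javaagent:{jar_path}={joined}"


# Values copied from the parameter list above.
AGENT_OPTIONS = {
    "file": "important-coverage-data.xml",
    "merge": "gensuff",
    "include": r"com\.mycompany*",
    "agent.port": 3336,
}

# "/path/to/jcov.jar" is a placeholder, not a real location.
AGENT_ARG = build_agent_arg("/path/to/jcov.jar", AGENT_OPTIONS)
```

How the resulting string reaches the JVM is up to your startup scripts; with Tomcat, one common place is the CATALINA_OPTS variable. Mind the shell quoting of the backslash and the asterisk in the include pattern.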
Start the server however you do it in normal operation – just make sure it picks up the configuration from step 1. In this concrete configuration you will know that it works if something listens on localhost:3336.
With the agent.port configuration, commanding the agent to dump the current coverage statistics to an XML file turns out to be really crude.
Connect to the server using:
telnet localhost 3336
Every time you send the ‘save’ string to the server, it saves the next portion of coverage data to an XML file and responds with the ‘saved’ string in your telnet session.
Of course you can automate this if you need to dump large amounts of data periodically, even in bash, for instance:
(sleep 1; echo "open localhost 3336"; sleep 1; while :; do echo "save"; sleep 60; done ) | telnet
This will dump a new portion of data every minute. You can also use any other telnet client integrated with your testing environment.
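For a Python test suite, the same ‘save’/‘saved’ exchange can be done over a plain socket instead of telnet. This is a sketch under assumptions, not JCov's documented client: `request_dump` is a hypothetical helper name, and the CRLF line ending simply mimics what a telnet session sends.

```python
import socket


def request_dump(host="localhost", port=3336, timeout=30.0):
    """Ask the agent to dump coverage data; return True once it answers 'saved'."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        # Assumption: a CRLF-terminated line, mimicking what telnet sends.
        conn.sendall(b"save\r\n")
        reply = b""
        while b"saved" not in reply:
            chunk = conn.recv(1024)
            if not chunk:  # connection closed before the confirmation arrived
                return False
            reply += chunk
        return True
```

Calling `request_dump()` from the suite's setup and teardown hooks is one way to bracket a test run; if the agent does not answer within the timeout, the socket call raises an exception instead of returning.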
Remember you can merge the results later with Merge command from
As in step three. My practice was to dump the data just before running the suite and again just after, and then to use only the second file to calculate the coverage report.
To generate the report in a navigable HTML format, with the covered lines marked in the class sources, execute a command like:
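Because every dump lands in a separate file with a random suffix, picking "the second file" by hand gets tedious. A small sketch that selects the most recently modified dump, assuming the ‘gensuff’ files keep the configured root name as a prefix (the exact suffix format is not something I rely on here, and `newest_dump` is a hypothetical helper name):

```python
import glob
import os


def newest_dump(directory, root="important-coverage-data.xml"):
    """Return the most recently modified dump file, or None if there is none."""
    # Assumption: 'gensuff' appends its suffix to the root filename.
    pattern = os.path.join(glob.escape(directory), glob.escape(root) + "*")
    return max(glob.glob(pattern), key=os.path.getmtime, default=None)
```

Run it right after the second dump has been confirmed, so that the newest file is the one written after the suite finished.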
java -jar jcov.jar RepGen -sourcepath /path/to/module/one/src:/path/to/module/two/src /path/to/coverage-data.xml
This will generate, by default, fully navigable HTML report in /report directory below where
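When the report is generated from a script, the source path for several modules can be joined programmatically. A minimal sketch, assuming the platform path separator is what RepGen expects between source trees (as in the colon-separated example above); `repgen_command` is a hypothetical helper name.

```python
import os


def repgen_command(jcov_jar, source_dirs, coverage_file):
    """Build the argument list for a RepGen run over several source trees."""
    # Assumption: source trees are joined with the platform path separator
    # (':' on Linux and macOS, ';' on Windows).
    sourcepath = os.pathsep.join(source_dirs)
    return ["java", "-jar", jcov_jar, "RepGen", "-sourcepath", sourcepath, coverage_file]
```

The returned list can be handed to `subprocess.run(cmd, check=True)`; passing arguments as a list avoids shell quoting problems with paths that contain spaces.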
Troubles I’ve encountered, and you may too
Apart from the trivial things I spent a lot of time on, which you no longer have to now that I’ve figured them out (like having to build it with good old Ant, or working out how to signal the agent to dump its contents to a file), there are some problems you may still stumble upon:
- Out Of Memory Errors / HeapSize
Don’t be surprised if your application throws OOMs at you or starts to act strangely. This may be the GC dying. Instrumenting the classes and storing execution statistics may use quite a lot of memory. I had to increase my heap size a lot to start up a big set of Tomcat applications, even after including only my company’s classes in the filter. It also happened that when I didn’t apply ANY filter (just took all the classes), the JVM sometimes crashed instantly at startup. With filters, everything was all right.
- Multiple source trees
This was covered in the tutorial, but I would like to mention it again. The help for jcov.jar mentions just a source location, singular. In fact you can add multiple directories separated with your operating system’s path separator (you can easily find this in the JCov sources, going from the parameter name).
- Spring/CGLib bytecode altering
Sometimes Spring resorts to generating class proxies which apparently don’t just nicely delegate calls to the old implementations but dynamically subclass them. Effectively this means that the original class is lost in favor of the newly generated one. You will see its statistics in the report, but without any relation to the original .java source. Another reason to reconsider using Java class inheritance nowadays!
- Grabber module
In this article I show how to use the server embedded in the agent, but in theory, according to the JavaOne presentation, the standard way to do it is to use the so-called Grabber server, to which the Java agent connects. I successfully set up the Grabber server, but the client didn’t seem to work, so I resorted to the server embedded in the agent.
- Untouched classes
In this example only loaded classes are instrumented, so remember that if a class is not touched by the tests at all, you won’t see it in the report. If you need that, you have to generate a so-called ‘template’ beforehand.
Ok, what next?
-hv is your friend
JCov is quite a mysterious tool in terms of community and documentation, but it’s really worth noting that it has quite reasonable help built into jcov.jar itself.
It has many features which go beyond this introduction, for instance merging existing coverage data files, filtering results, even generating diffs from build to build.
Start by typing:
java -jar jcov.jar
You will get a description of the modules of jcov.jar. You can get verbose info about every part with the -hv option, which stands for ‘help-verbose’. For instance:
java -jar jcov.jar Agent -hv
It is also worth exploring the source code itself near the places where command-line parameters are processed.
Also take a detailed look at the JavaOne presentation I mentioned.
Why not Jacoco?
At first Jacoco seemed to be the better choice, with much higher adoption, a bigger community, and IDE and CI integrations, but at least in my case it basically just didn’t work. After importing the Jacoco coverage data into IntelliJ it was clear that lines which had certainly been executed were marked as not executed in the report, which defeats the purpose. Maybe it was a faulty plugin. I tried to generate the report outside the IDE, using Jacoco’s own tools, but... the report generation crashes when there are two classes of the same name in different packages, which for a codebase as big as mine made this tool useless. I have to stress, though, that I previously had good experiences with this product when using the Maven/Jenkins plugins for tests executed in the JVM, so for you it may be worth a shot.