Saturday, December 1, 2007

Big Project Troubleshooting Pt. 1: Transitive Dependencies

Maven is complex, but so is Ant or Make once a project passes a certain size. With Maven many things are done in a standard way, so that when a project becomes large you should still be able to work with it easily. With Ant or Make a large project will likely have many arbitrarily named targets which you will have to study and learn. With Maven those targets are pretty much standardized by the plugins, which provide all the functionality. Large Maven projects commonly run into the same sorts of issues, and these can be pretty nasty for the uninitiated. So to help you out I'm going to write a series of articles tackling the problems faced in big Maven projects. In this article I'm going to give you some information on issues related to dependencies.

The dependency mechanism is the main reason many people become interested in Maven. It sure makes life easy... but not everything can be automated. Sometimes you will need to give Maven some help. Maven tries to resolve the dependencies needed by your project correctly by building a graph. More on this in a second.

Maven has what is called a 'transitive dependency'. Let me explain transitive dependencies for those who don't know. A project's dependencies are defined in its pom.xml. Every dependency has its own pom.xml (stored in the repository alongside the dependency), which Maven uses to find the dependencies that the dependency itself relies on. These 'relied upon' dependencies are known as transitive dependencies, and they are added to your project's classpath automatically. You can have any number of levels of transitive dependencies, because Maven also fetches each transitive dependency's dependencies. In Maven this forms what is called the dependency graph. Yes, this is the same graph I spoke of earlier. It looks like a tree, with your project as the root node, the first level holding your project's direct dependencies, and subsequent levels holding transitive dependencies.

Now let's talk about the sticky points. If the dependency graph for your project contains two or more versions of the same dependency, Maven has to choose which one to use. By default Maven uses the version closest to the top node of the tree. If this version is older than the other version(s) you may have a problem, most likely a NoClassDefFoundError: a class that was added in the newer version cannot be found because it does not exist in the older one. When you get this error it should be obvious which dependency is causing it, since the stack trace gives the fully qualified name of the missing class. We can use this to figure out how to fix the problem.
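To make the "closest to the top wins" rule concrete, here is a minimal Python sketch. It is not Maven's code, only a toy model of the rule described above; the graph, the artifact names and the resolve function are all made up for illustration.

```python
from collections import deque

# Hypothetical toy graph: each "artifact:version" maps to the dependencies
# declared in its pom.xml. None of these names come from a real project.
GRAPH = {
    "my-app:1.0": ["lib-a:1.0", "lib-b:1.0"],
    "lib-a:1.0": ["commons-x:1.0"],
    "lib-b:1.0": ["lib-c:1.0"],
    "lib-c:1.0": ["commons-x:2.0"],
    "commons-x:1.0": [],
    "commons-x:2.0": [],
}


def resolve(graph, root):
    """Walk the graph breadth-first, keeping the first version seen of each artifact.

    Breadth-first order visits shallower nodes first, so the version nearest
    the root wins, regardless of which version number is higher.
    """
    chosen = {}  # artifact name -> (version, depth)
    queue = deque((dep, 1) for dep in graph[root])
    while queue:
        node, depth = queue.popleft()
        name, version = node.split(":")
        if name in chosen:
            continue  # a nearer version was already picked; skip this one
        chosen[name] = (version, depth)
        queue.extend((dep, depth + 1) for dep in graph.get(node, []))
    return chosen


print(resolve(GRAPH, "my-app:1.0")["commons-x"])  # prints ('1.0', 2)
```

In this toy graph the older commons-x 1.0 sits at depth 2 and the newer 2.0 at depth 3, so the older one is selected. That is exactly the situation that leads to a class missing at runtime.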

At this point you can open a terminal session, cd to your local repository (~/.m2/repository by default) and then cd into the dependency's folder. This is helpful because it lets you see which versions of the dependency have been downloaded and which classes are in each jar. You can run "jar -tf {jarname} | grep {ClassName}" on each jar to search it for the given class.
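If there are many jars to check, the search can be scripted. The following is a rough Python sketch of the same idea as the "jar -tf | grep" command, relying on the fact that a jar is a zip file. The function name jars_containing is my own invention, and it only matches top-level classes by simple name.

```python
import zipfile
from pathlib import Path


def jars_containing(folder, class_name):
    """Return the jars under `folder` that hold an entry named `class_name`.class."""
    suffix = "/" + class_name + ".class"
    matches = []
    for jar in sorted(Path(folder).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                # Entry names look like com/example/Foo.class; the leading "/"
                # also lets classes in the default package match.
                if any(("/" + entry).endswith(suffix) for entry in archive.namelist()):
                    matches.append(jar)
        except zipfile.BadZipFile:
            continue  # skip corrupt or partially downloaded jars
    return matches
```

You would call it with the dependency's folder in your local repository and the simple name of the missing class, for example jars_containing(folder, "Foo"), where both the folder and the class name depend on your own project.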

Let us discuss troubleshooting. First, run the mvn dependency:resolve command. This prints a listing of the resolved dependencies for your project, that is, the ones Maven has calculated from the dependency graph for use in the project. You should be able to see which version of the troubled dependency is being used. I had you cd to the dependency's repository location; check now which versions you have there. Use the "jar -tf" command to check whether the class behind the NoClassDefFoundError exists in one of the newer version jars (if you have any). Once you have found a jar that contains the class, make note of its version.

Now for some debugging. This is where you get waist deep in it. We are going to use the -X option to Maven, which enables debug output. We also want to run just the compiler plugin's compile goal, which will give us all the info we need on the dependency graph. You might want to do something like "mvn -X compiler:compile > junk.txt" so that you can use a text editor like Vi to search through the file. When you view this output you will find a listing of the absolute classpath, including which jars are being used. You also get a printout of the dependency graph, including info on what ended up making it in. It is pretty easy to find the version of a dependency being used: find the one with the least amount of indent. If two or more versions share the least amount of indent, do not assume the newest one will be selected; the tie is not broken by version number, so it is safest to declare the version you want explicitly.

To get around version problems you can do a number of things. The easiest is to declare the newer version of the dependency directly in the pom.xml of the project. This makes sure the newer version is used, because a version declared directly in your project sits at the top of the dependency tree, nearer than any transitive version. You can also add an exclusion for the old version to a dependency definition in the pom.xml file. Since the old version is excluded, the newer (yet further down the graph) version will be used. When using exclusions you will need to figure out which dependency in the pom.xml file brings in the transitive dependency to be excluded. Neither of these fixes is tough to perform, and both will become easy with practice.
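Figuring out which of your direct dependencies drags in the unwanted version amounts to walking the tree from the root down to that version. Here is a small Python sketch of the idea, again on a made-up graph; the paths_to function and every artifact name in it are hypothetical.

```python
# Hypothetical toy graph: "artifact:version" -> dependencies from its pom.xml.
GRAPH = {
    "my-app:1.0": ["lib-a:1.0", "lib-b:1.0"],
    "lib-a:1.0": ["commons-x:1.0"],
    "lib-b:1.0": ["lib-c:1.0"],
    "lib-c:1.0": ["commons-x:2.0"],
}


def paths_to(graph, node, target, trail=()):
    """Yield every chain of dependencies leading from `node` down to `target`."""
    trail = trail + (node,)
    if node == target:
        yield trail
        return
    for dep in graph.get(node, []):
        if dep not in trail:  # guard against cycles in the graph
            yield from paths_to(graph, dep, target, trail)


for path in paths_to(GRAPH, "my-app:1.0", "commons-x:1.0"):
    print(" -> ".join(path))  # prints my-app:1.0 -> lib-a:1.0 -> commons-x:1.0
```

The second entry in each printed chain is the dependency in your own pom.xml that would carry the exclusion; in this toy example that is lib-a.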