Thursday, December 20, 2007

Going Offline

The problem: You want Maven to stop making all those network calls when you issue it commands like clean, build, test, package etc. You might want to work on a project while you have no network connection.

The solution: You can use the off line mode for Maven to keep it from making any network calls. You can do so by issuing the -o flag to the mvn command. When using this mode Maven will need to have all the jars it needs in your local repository (~/.m2/repository by default). To make this easier you can use a trick provided by the dependency plugin. Imagine we just checked out a project. Now we want to work on this project off line. To get all the dependencies needed by the project you can issue the command "mvn dependency:go-offline". This will download the dependencies you need. Now note it will not make sure you have all the plugins you need. Say you want to run "mvn -o site:run", creating a site for your project and making it available via Jetty. You will need to have used this plugin before so that it is in your local repo. The same goes for compile, package, test and the like.

The million dollar question is why use off line mode? Well it could be that it makes quicker the process you are doing with Maven, since Maven will not try to call out to any network repos. I've timed the difference between using it and not using it for some fairly large projects and found it doesn't make a huge difference. For really big projects (I cannot really name one) it might shed enough seconds to make it worth using often, maybe even creating an alias. This 'huge' project mind you would probably have upwards of 200 jar (dependencies) or more being managed. Another reason one might use off line is as follows: you have a local network repository used by the developers at work. One day the person you administers this repository is going to move it somewhere new. This going to take a few hours. Using off line mode you could weather this time if prepared without noticing too much that the repository is down. This could also be helpful in situations where you are going to not be able to reach the repository, like working at home with no network connection, or the server hosting the repo crashed.

Sunday, December 16, 2007

Using Maven for Bash Script Projects

I recently created a Maven project unlike any I have ever created. This project contained no Java sources! No it was not a Groovy project ;) It was a project that contained only bash scripts. Read up for the why and the how.

I created a set of bash scripts that allow invoking SOAP services, returning the XML response. The SOAP services provide information on email distribution lists, such as which addresses are on a list and which lists a user's email address is subscribed to. These scripts are likely to change and grow over time so I made sure to put them into Subversion. Not too shabby. But what about distributing the scripts? What about managing versions of the scripts? I have become accustomed to having the aforementioned handled in my projects via Maven. Maven makes such things simple to set up and automate. I determined I could move these bash scripts into a Maven project to cultivate these benefits.

Now the how. For packaging I use the Maven assembly plugin. It allows created automated creation of zip tar.gz and gzip archives. These archives work great work distributing a project. I started my work by creating a skeleton Maven project. I put the scripts and related files into the src/main/scripts folder of the project. Next I created a xml file used by the assembly plugin to direct how files will be structured in the created compressed archive or archives. Here my file:

<?xml version="1.0"?>
<assembly>
<id>zimbrascripts</id>
<formats>
<format>zip</format>
</formats>
<includeBaseDirectory>false</includeBaseDirectory>
<dependencySets>
<dependencySet>
<outputDirectory>hacksdb/lib</outputDirectory>
</dependencySet>
</dependencySets>
<fileSets>
<fileSet>
<directory>src/main/scriptapp/bin</directory>
<outputDirectory>zimbrascripts/bin</outputDirectory>
<fileMode>0755</fileMode>
<directoryMode>0755</directoryMode>
</fileSet>
<fileSet>
<directory>src/main/scriptapp/etc</directory>
<outputDirectory>zimbrascripts/etc</outputDirectory>
<fileMode>0644</fileMode>
<directoryMode>0755</directoryMode>
</fileSet>
<fileSet>
<directory>src/main/scriptapp/tokens</directory>
<outputDirectory>zimbrascripts/tokens</outputDirectory>
<fileMode>0600</fileMode>
<directoryMode>0777</directoryMode>
</fileSet>
<fileSet>
<directory>src/main/scriptapp/xml</directory>
<outputDirectory>zimbrascripts/xml</outputDirectory>
<fileMode>0644</fileMode>
<directoryMode>0755</directoryMode>
</fileSet>
</fileSets>
<files>
<file>
<source>src/main/scriptapp/README.txt</source>
<outputDirectory>zimbrascripts</outputDirectory>
</file>
</files>
</assembly>

I'll write up a more thorough article on using the assembly plugin later. For now suffice to say this file will instruct Maven to create a zip archive containing the files listed above. The last thing we want to do is configure the assembly plugin in the project's POM file. I used the following XML in my projects pom:

<build>
<plugins>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<descriptors>
<descriptor>src/main/assembly/assembly.xml</descriptor>
</descriptors>
</configuration>
<executions>
<execution>
<id>make</id>
<phase>package</phase>
<goals>
<goal>assembly</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>

In the above config I also bind the assembly goal to the package phase. This means when I execute "mvn package" the assembly plugin will fire off and create the zip archive. Nice! Otherwise you can also use "mvn assembly:assembly" should do the job. Now so that I could use the scm and release plugins to check in changes to SVN and create tagged releases of the project I added the following to the POM:

<scm>
<connection>scm:svn:https://svn.com/DistributionLists/DistListScriptsMaven/trunk/</connection>
<developerConnection>scm:svn:https://svn.com/DistributionLists/DistListScriptsMaven/trunk/</developerConnection>
<url>https://svn.com/DistributionLists/DistListScriptsMaven/</url>
</scm>

Now to release the current version of the project I can type "mvn release:prepare". This will tag the current version in SVN, then update the POM to the new version number and check those changes into the SVN trunk. This saves me the hassle and reduces the chances of user error in this boring, repetitive process. In the vein of the SCM configuration you could now use other plugins such as the JIRA reporting plugin, site plugin etc. to bolster this simple bash script project.

Saturday, December 1, 2007

Big Project Troubleshooting Pt. 1: Transitive Dependencies

Maven is complex, but so is Ant or Make once a project passes a certain size. With Maven many things are done in a standard way, so that when a project becomes large you still should be able to work with it easily. With Ant or Make a large project will likely have many arbitrarily named targets which you will have to study and learn. With Maven all those named targets are pretty much standardized by the plugins which provide all functionality. Large projects in Maven commonly have the same sorts of issues. These can be pretty nasty to for the uninitiated. So to help you out I'm going to write a series of articles tackling the problems faced in big Maven projects. In this article I'm going to give you some information on issues related to dependencies.

The dependency mechanism is the main reason many people become interested in Maven. It sure makes life easy... but everything cannot always be automated. Sometimes you will need to give Maven some help. It will try to correctly resolve the dependencies needed by your project by creating a graph. More on this in a second.

Maven has what is called a 'transitive dependency'. Let me explain transitive dependencies for those who don't know. A dependency of a project is defined via a project's pom.xml. Every dependency has a pom.xml (in the repository with the dependency) which is used by Maven to find the dependencies relied upon by a dependency. These 'relied upon' dependencies are also known as transitive dependencies. They are included into your projects class path automatically. Now you can have any number of levels of transitive dependencies because Maven also will get the each transitive dependency's dependencies. In Maven this forms what is called the dependency graph. Yes this is the same graph I spoke of earlier. It looks like a tree, with your project as the root node, tree level one as the dependencies of your project and subsequent levels being transitive dependencies.

Now lets talk about the sticky points. If there are two or more versions of the same dependency in the graph of the dependencies for your project then Maven will need to make a choice as to which one to use. By default Maven will use the version that is the closest to the top node of the tree. If this version is older than the other version(s) you may have a problem. Likely a NoClassDef where a class from the new dependency cannot be found, because it does not exist in the older version. When you get this error it should be obvious which dependency the error is being caused by. The stack trace should give the full package path to the class. We can use this to figure out how to fix the problem.

At this time you probably could open a terminal session and cd to your local repository location (~/.m2/repository by default) and then cd into the dependency's folder in the repository. This can be helpful since it will allow you to see which versions of the dependency are being used and which classes are in each dependency. You could execute a "jar -tf {jarname} | grep {ClassName}" command on each jar to search it for the given class.

Let us discuss troubleshooting. First you will want to run the mvn dependency:resolve command. This will print out a listing of the resolved dependencies for your project. Resolved dependencies are those calculated by Maven via the dependency graph to be used in the project. You should be able to see which version of the troubled dependency is being used. I had you cd to the dependencies repo location. Check now which versions you have in your repository. You should be able to use the "jar -tf " command to check if the class causing the NoClassDef exists in one of the newer version jars (if you have newer version jars). Once you have found a jar that contains the class make note of the version.

Now for some debugging. This is where you get waist deep in it. We are going to use the -X option to Maven, which enables debugging. We also want to run just the compile phase which will give us all the info we need on the dependency graph. You might want to do something like "mvn -X compiler:compile > junk.txt" just so that you can use a text editor like Vi to search about the file. Now when you view this output you will be able to find a listing of the absolute classpath, including with jars are being used. You also will get a printout of the dependency graph including info on what ended up making it in. It is pretty easy to find the version of a dependency being used: find the one with the least amount of indent. If one or more versions share the least amount of indent then the newest version will be selected.

To get around version problems you can do a number of things. The easiest thing to do is define the newer version of the dependency directly in the pom.xml of the project. This will make sure that the newer version is used because it is 1) at the top of the dependency tree 2) the version is highest. You can also use excludes for a dependency definition in the pom.xml file for the old version of the dependency. Since the old version is excluded the newer (yet further down the graph) version will be used. When using excludes you will need to figure out which dependency in the pom.xml file has the transitive dependency to be excluded. Neither of these fixes are tough to perform, and will become easy to do in time.