How to update an XML file without reformatting it

Recently, I needed to modify the content of a few thousand XML files. The modification itself was nontrivial: it consisted of reading two attributes from an element and then updating the value of one of them. To make it even more interesting, I wanted to keep the formatting of all files intact. My goal was to create a reusable utility that could also be used in the future, should such a need arise again.

Because of the need to preserve the formatting, I rejected all the tools I usually use (Dom4j, SAX, JAXB) and started to search for a suitable one. Finally, I found the answer in the form of a recommendation: the recommended tool was VTD-XML (also available on GitHub).
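
To illustrate why a parse-and-serialize approach does not fit this task, here is a minimal sketch. It uses the DOM and Transformer classes bundled with the JDK rather than Dom4j or JAXB, so treat it as an analogy and not as a statement about those libraries. The class name DomRoundTrip and the sample document are made up for the demonstration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: parses XML into a tree and writes it back unchanged.
final class DomRoundTrip {

    /** Parses the XML into a DOM tree and serializes it again without modifying it. */
    static String roundTrip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance()
                    .newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XML round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample with single quotes and extra spaces inside the tag.
        String original = "<?xml version='1.0' encoding='UTF-8'?>\n"
                + "<dataset>\n"
                + "   <element id='1'   control=\"123\" />\n"
                + "</dataset>";
        String copy = roundTrip(original);
        System.out.println(copy);
        System.out.println("identical: " + original.equals(copy));
    }
}
```

Even though nothing in the tree was changed, the serializer writes the document in its own style (for example, attribute values come back in double quotes), so the copy is no longer byte-for-byte identical to the input.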

The home page looked promising, and its list of advantages was probably written by Horst Fuchs. I highly recommend reading it. But I cared more about the actual usage of the tool, so the most important part of the page for me was "Code samples".

You can download the library directly or, if you are using Maven, just add the dependency.
The first part of the code is quite simple: you need to create instances of several key classes and link them together.
import com.ximpleware.*;
import java.io.FileOutputStream; // needed for saving the result

VTDGen vg = new VTDGen(); // base instance
vg.parseFile("path_to_file", true); // file load
VTDNav vn = vg.getNav(); // navigator
XMLModifier xm = new XMLModifier(vn); // object handling modifications
AutoPilot ap = new AutoPilot(vn); // object handling search
Once you acquire those instances, you can process the file.
ap.selectElement("element"); // element we are searching
while (ap.iterate()) { // loop for each "element" found
   int idLoc = vn.getAttrVal("id"); // find value of attribute id
   int controlLoc = vn.getAttrVal("control"); // find value of control
   String id = idLoc != -1 ? vn.toString(idLoc) : null; // read id
   String control = controlLoc != -1 ? vn.toString(controlLoc) : null;

   if (id == null || control == null) {
      // not the element we are looking for
      continue;
   }
   // for demonstration purposes, extend control by id
   xm.updateToken(controlLoc, control + id);
}
// finally save the modified file and close the stream
try (FileOutputStream out = new FileOutputStream("path_to_new_file")) {
   xm.output(out);
}
As you can see, there are no placeholder objects present. The current element is completely hidden. The attributes are represented only by their locations, and if you want to read the actual value, you need to make another call to the navigator.
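
The idea of working with locations instead of objects can be sketched in plain Java. This is only an illustration of the principle, not how VTD-XML is implemented internally: the class OffsetSplice, the method splice, and the use of indexOf as a stand-in for the location a parser would report are all assumptions made for the example.

```java
// Hypothetical demo class: edits a document by position instead of rebuilding it.
final class OffsetSplice {

    /**
     * Replaces the {@code length} characters starting at {@code offset} with
     * {@code replacement}. Everything before and after is copied verbatim.
     */
    static String splice(String source, int offset, int length, String replacement) {
        return source.substring(0, offset)
                + replacement
                + source.substring(offset + length);
    }

    public static void main(String[] args) {
        // Deliberately odd formatting: extra spaces and single quotes.
        String line = "   <element id=\"8514903\" version=\"1\"   control='123' />";
        String oldValue = "123";

        // Stand-in for the token location a parser would hand back.
        String marker = "control='";
        int offset = line.indexOf(marker) + marker.length();

        String updated = splice(line, offset, oldValue.length(), oldValue + "8514903");
        System.out.println(updated);
        // prints:    <element id="8514903" version="1"   control='1238514903' />
    }
}
```

Because only the characters inside the located range are replaced, the unusual spacing and quoting around the value survive the update untouched.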

This code snippet can be used on a file like this.
<?xml version='1.0' encoding='UTF-8'?>
<dataset xmlns:xsi="" xsi:noNamespaceSchemaLocation="../../dataset.xsd">
   <element id="8514903" version="1" control="123"/>
   <element id="8514905" version="1" />
</dataset>


While code that uses this library is not as easy to read as code that uses Dom4j, it has other strengths. If you are looking for a library that offers speed and a low memory footprint, or you just need to make small updates to XML files without reformatting them, this library might be exactly what you are looking for.

