Monday, January 31, 2011

How to Use Dom4J XPath with XML Namespaces

Noting this because I can't for the life of me remember this; every time I need to do it I have to Google it. Perhaps because it's not tremendously intuitive, at least to my way of thinking.

Suppose we want to select the filter named terracotta from the XML below via something similar to //filter[filter-name/text()='terracotta'] (with commons-io on classpath for FileUtils if you want to do it *exactly* as shown):
<?xml version="1.0" encoding="UTF-8"?>

<web-app version="2.4" xmlns="http://java.sun.com/xml/ns/j2ee"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd">

 <context-param>
  <param-name>javax.servlet.jsp.jstl.fmt.localizationContext</param-name>
  <param-value>messages</param-value>
 </context-param>

 <servlet>
  <servlet-name>default</servlet-name>
  <servlet-class>org.mortbay.jetty.servlet.DefaultServlet</servlet-class>
  <init-param>
   <param-name>dirAllowed</param-name>
   <param-value>false</param-value>
  </init-param>

  <load-on-startup>0</load-on-startup>
 </servlet>
  
  <!-- filter for terracotta session mgmt. See http://www.terracotta.org/documentation/ga/web-sessions-install.html -->
  <filter> <!-- <== WE WANT TO SELECT THIS NODE -->
   <filter-name>terracotta</filter-name>
   <filter-class>org.terracotta.session.TerracottaJetty61xSessionFilter</filter-class>
   <init-param>
    <param-name>tcConfigUrl</param-name>
    <param-value>$TerracottaServerList</param-value>
   </init-param>
  </filter> 

  <!-- ...omitted...-->
</web-app>  
Our first draft of the code might look something like this:
// webXmlFile is a java.io.File pointing at the web.xml shown above
Document dom = DocumentHelper.parseText(FileUtils.readFileToString(webXmlFile));
Node tcFilterCfg = dom.selectSingleNode("//filter[filter-name/text()='terracotta']");
Unfortunately this gets us a null: the document declares a default namespace (xmlns="http://java.sun.com/xml/ns/j2ee"), and in XPath 1.0 an unprefixed name such as filter only matches elements that are in no namespace at all. To select it properly we have to build an XPath object that knows about namespaces and then write our query to explicitly indicate which one we are looking for (yes, we could probably set a default but that's sloppy). The result is this nice, clean, intuitive block of garbage:
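import java.util.HashMap;
import java.util.Map;
import org.apache.commons.io.FileUtils;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Node;
import org.dom4j.XPath;

Document dom = DocumentHelper.parseText(FileUtils.readFileToString(webXmlFile));
// Map a prefix of our choosing ("j2ee") to the document's default namespace URI
Map<String, String> namespaceUris = new HashMap<String, String>();
namespaceUris.put("j2ee", "http://java.sun.com/xml/ns/j2ee");
// Every element name in the query has to carry that prefix
XPath xPath = DocumentHelper.createXPath("//j2ee:filter[j2ee:filter-name/text()='terracotta']");
xPath.setNamespaceURIs(namespaceUris);
Node tcFilterCfg = xPath.selectSingleNode(dom);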

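Note that we use the Map to tell the XPath that j2ee:something means the element named something in the http://java.sun.com/xml/ns/j2ee namespace. The j2ee prefix itself is arbitrary; only the URI has to match the xmlns declared in the document.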
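For what it's worth, the null is not really a dom4j quirk; it falls out of how XPath 1.0 treats unprefixed names. As a rough sketch, here is the same before/after using only the JDK's built-in javax.xml.xpath API instead of dom4j, so it runs without any extra jars. The class name, the select helper, and the trimmed-down inline XML are made up for illustration and are not part of the original code.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses the JDK XPath API rather than dom4j.
class NamespaceXPathSketch {

    static final String J2EE_NS = "http://java.sun.com/xml/ns/j2ee";

    // Trimmed-down stand-in for the web.xml above (illustrative only).
    static final String WEB_XML =
        "<web-app version=\"2.4\" xmlns=\"" + J2EE_NS + "\">"
        + "<filter><filter-name>terracotta</filter-name></filter>"
        + "</web-app>";

    // Evaluates the expression against WEB_XML; returns the first match or null.
    static Node select(String expression) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // JAXP parsers ignore namespaces unless asked
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(WEB_XML)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Plays the same role as the prefix-to-URI Map in the dom4j version.
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "j2ee".equals(prefix) ? J2EE_NS : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) { return null; }           // not needed here
            public Iterator<String> getPrefixes(String uri) { return null; } // not needed here
        });
        return (Node) xpath.evaluate(expression, doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        // Unprefixed names look for elements in no namespace, so nothing matches.
        Node plain = select("//filter[filter-name/text()='terracotta']");
        System.out.println("unprefixed query matched: " + (plain != null));

        // Prefixed names resolve to the j2ee namespace URI, so the filter is found.
        Node prefixed = select("//j2ee:filter[j2ee:filter-name/text()='terracotta']");
        System.out.println("prefixed query matched: " + (prefixed != null));
    }
}
```

Same shape as the dom4j version: alias a prefix to the URI, then use that prefix on every element name in the query.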

The code winds up quite different (alias j2ee to such and such, then select the filter from j2ee) from what my brain is thinking about doing (select that one!), which is probably why I can never quite remember this.

Boo-urns!

2 comments:

GoVeN said...

If you are struggling with xml namespaces, there is a great tutorial on xpath namespaces at xml reports. It walks you through it in very simple steps

xml reports

ESSAM said...

If you are struggling with xml namespaces, there is a great tutorial on xpath namespaces at xml reports. It walks you through it in very simple steps
xml reports http://www.xml-reports.com/2011/05/xml-namespaces-for-dummies-part-1.html
