When we were studying how our application should parse some XML, we saw that there are three ways to do it:
- DOM, with JDOM or DOM4J, which parses the XML into Element and Attribute objects
- SAX, with Xerces, which parses the XML without instantiating objects
- StAX, which is similar to SAX but with better performance.
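Why does StAX perform better? Because it reads the stream (an InputStream or a Reader) with a pull model: your code asks the parser for the next event. SAX works by pushing: the parser calls your handler for every event. With pull, the application stays in control and can skip or stop whenever it wants. That is what makes it faster, and that is a good thing.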
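To make the push/pull difference concrete, here is a minimal sketch that collects element names both ways. It only uses the parsers bundled with the JDK, not Woodstox, and the class and method names (PullVsPush, elementNamesWithSax, elementNamesWithStax) are made up for this illustration. It says nothing about speed, it only shows who drives the parsing.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PullVsPush {

    // SAX (push): the parser drives and calls our handler for every event.
    static List<String> elementNamesWithSax(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    // StAX (pull): our loop drives and asks the parser for the next event.
    static List<String> elementNamesWithStax(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><item/><item/></root>";
        System.out.println(elementNamesWithSax(xml));
        System.out.println(elementNamesWithStax(xml));
    }
}
```

With the pull version, leaving the loop early is just a break, while with SAX the usual trick is to throw an exception from the handler.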
How do you use StAX? Xerces and BEA each provide an implementation, but the one to use seems to be Woodstox: it is very good and performs better. Oh yes, StAX is a specification (JSR 173). You can look it up on Google, and if you find it, leave a comment ;).
OK, let’s see how to read XML. StAX works with a sequence of events, as if you were reading the stream yourself. Things like: start of an element, text, end of an element.
When the event is a start-element event, you can access the attributes.
import java.io.FileInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.events.XMLEvent;
import org.codehaus.stax2.XMLInputFactory2;
import org.codehaus.stax2.XMLStreamReader2;

XMLInputFactory2 factory = (XMLInputFactory2) XMLInputFactory2.newInstance();
// Tell the factory to do the right things
factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.FALSE);
factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
// Optimize everything for speed
// (you could optimize for memory usage or for XML compliance instead)
factory.configureForSpeed();
// Build a reader and use it like an Iterator
XMLStreamReader2 reader = (XMLStreamReader2) factory.createXMLStreamReader(
        "fichier.xml", new FileInputStream("fichier.xml"));
int eventType = reader.getEventType();
while (reader.hasNext())
{
    eventType = reader.next();
    switch (eventType)
    {
        case XMLEvent.START_ELEMENT:
            System.out.println("Start Element : " + reader.getName().toString());
            // You can access attributes here with getAttributeValue or getAttributeName
            break;
        case XMLEvent.END_ELEMENT:
            System.out.println("End Element : " + reader.getName().toString());
            break;
        // Other event types
        /*
         * START_DOCUMENT
         * END_DOCUMENT
         * START_ELEMENT
         * END_ELEMENT
         * ENTITY_DECLARATION
         * ENTITY_REFERENCE
         * NAMESPACE
         * CDATA
         * CHARACTERS
         * COMMENT
         * ATTRIBUTE
         * PROCESSING_INSTRUCTION
         */
    }
}
// And close
reader.close();
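As a small follow-up on the attributes, here is a sketch of how they could be read while the reader sits on a start-element event. It uses only the standard javax.xml.stream API from the JDK, so it should work with any StAX implementation. The class name AttributeDump and the "element@attribute" key format are just choices made for this example.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class AttributeDump {

    // Returns "element@attribute" -> value for every attribute in the document.
    static Map<String, String> readAttributes(String xml) throws XMLStreamException {
        Map<String, String> result = new LinkedHashMap<>();
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                // Attributes can only be read while positioned on START_ELEMENT.
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    for (int i = 0; i < reader.getAttributeCount(); i++) {
                        String key = reader.getLocalName() + "@"
                                + reader.getAttributeLocalName(i);
                        result.put(key, reader.getAttributeValue(i));
                    }
                }
            }
        } finally {
            reader.close();
        }
        return result;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(readAttributes("<root name=\"value\"><item id=\"1\"/></root>"));
    }
}
```

Calling getAttributeValue on any other event type throws an IllegalStateException, which is why the check on START_ELEMENT comes first.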
And now, how to write XML. Let’s just look at the code, it is simple:
import java.io.ByteArrayOutputStream;
import org.codehaus.stax2.XMLOutputFactory2;
import org.codehaus.stax2.XMLStreamWriter2;

// Yet another factory (an output factory this time)
XMLOutputFactory2 factory = (XMLOutputFactory2) XMLOutputFactory2.newInstance();
ByteArrayOutputStream bos = new ByteArrayOutputStream();
String encStr = "UTF-8";
XMLStreamWriter2 sw = (XMLStreamWriter2) factory.createXMLStreamWriter(bos, encStr);
sw.writeStartDocument();
sw.writeStartElement("root");
sw.writeAttribute("name", "value");
sw.writeEndElement();
sw.writeEndDocument();
// And close
sw.close();
bos.close();
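If you want to see the result as a String, a variant like the following could help. It is only a sketch using the standard javax.xml.stream API from the JDK (no Woodstox cast), and it adds a nested element with text that is not in the code above. The names WriteSketch and writeDocument are invented for the example.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class WriteSketch {

    // Writes a small document into a String and returns it.
    static String writeDocument() throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        try {
            writer.writeStartDocument();
            writer.writeStartElement("root");
            writer.writeAttribute("name", "value");
            writer.writeStartElement("item");
            // Special characters in text are escaped by the writer.
            writer.writeCharacters("a < b");
            writer.writeEndElement(); // closes item
            writer.writeEndElement(); // closes root
            writer.writeEndDocument();
        } finally {
            writer.close();
        }
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(writeDocument());
    }
}
```

The exact XML declaration at the top depends on the implementation, so it is better not to rely on it in tests.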
Now you are greater than ever!
A little tip for using Woodstox on a WebLogic server: put these lines into weblogic.xml:
<container-descriptor>
    <prefer-web-inf-classes>true</prefer-web-inf-classes>
</container-descriptor>
Why? Because these servers ship their own StAX implementation, which could cause class cast errors with my code.
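To check which implementation the server really gives you before casting, something like this sketch could be logged at startup. It relies only on the standard XMLInputFactory.newInstance(). The helper names are hypothetical, and the "com.ctc.wstx." prefix is an assumption based on the package name Woodstox classes use.

```java
import javax.xml.stream.XMLInputFactory;

class StaxImplementationCheck {

    // Hypothetical helper: name of the concrete factory class chosen at runtime.
    static String inputFactoryClassName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    // Hypothetical helper: assumes Woodstox classes live under "com.ctc.wstx.".
    static boolean looksLikeWoodstox(String className) {
        return className.startsWith("com.ctc.wstx.");
    }

    public static void main(String[] args) {
        String name = inputFactoryClassName();
        System.out.println("StAX input factory: " + name);
        System.out.println("Woodstox? " + looksLikeWoodstox(name));
    }
}
```

If the answer is not Woodstox, the casts to XMLInputFactory2 and XMLStreamReader2 are the ones that would fail.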