
How to avoid extra blank lines in XML generation with Java?

Currently I'm trying to develop some code with Java 9 and javax.xml libraries (both mandatory for my task) that edits an XML file and I'm having some weird issues adding child nodes.

This is the XML file:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
</users>

and I want to edit it to build something like this:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
    <user>
        <name>A name</name>
        <last-name>Last Name</last-name>
        <username>username</username>
    </user>
</users>

Now, the first run of the code adds a single blank line before the <user> node. When it runs a second time, it fills the file with more blank lines:

<users>


    <user>

        <name>name</name>

        <last-name>lastname</last-name>

        <username>username</username>

    </user>

    <user>
        <name>name</name>
        <last-name>lastname</last-name>
        <username>username</username>
    </user>
</users>

This is the XML generated after running the program 2 times. As you can see, it adds blank lines before the <user> nodes and between the other nodes: exactly n-1 blank lines between nodes, where n is the number of times the code has been executed.

To find out what those nodes contain before the file is updated, I wrote the following code:

int i=0;
while (root.getChildNodes().item(i)!=null){
  Node aux = root.getChildNodes().item(i);
  System.out.println("Node text content: ".concat(aux.getTextContent()));
  i++;
}

1st execution:

Node text content: 

Node text content: namelastnameusername

2nd execution:

Node text content: 


Node text content: 
        name
        lastname
        username

Node text content: 

Node text content: namelastnameusername

3rd execution:

Node text content: 



Node text content: 

        name

        lastname

        username


Node text content: 


Node text content: 
        name
        lastname
        username

Node text content: 

Node text content: namelastnameusername

Finally, this is the Java code:

private static void saveUser(String firstName, String lastName, String username){
  DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    try {
      DocumentBuilder builder = factory.newDocumentBuilder();
      Document doc = builder.parse(new File(databaseFile));
      Element root = doc.getDocumentElement();
      root.normalize();

      // build user node
      Element userNode = doc.createElement("user");
      Element nameNode =  doc.createElement("name");
      Element lastNameNode = doc.createElement("last-name");
      Element usernameNode = doc.createElement("username");

      //build structure
      nameNode.appendChild(doc.createTextNode(firstName));
      lastNameNode.appendChild(doc.createTextNode(lastName));
      usernameNode.appendChild(doc.createTextNode(username));

      userNode.appendChild(nameNode);
      userNode.appendChild(lastNameNode);
      userNode.appendChild(usernameNode);
      root.appendChild(userNode);

      //write the updated document to file or console
      TransformerFactory transformerFactory = TransformerFactory.newInstance();
      Transformer transformer = transformerFactory.newTransformer();
      DOMSource source = new DOMSource(doc);
      StreamResult result = new StreamResult(new File(databaseFile));
      transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
      transformer.setOutputProperty(OutputKeys.INDENT, "yes");
      transformer.transform(source, result);
    }catch (SAXException | ParserConfigurationException | IOException | TransformerException e1) {
      e1.printStackTrace();
    }
}

The only solution I could find is to delete blank lines after XML generation, but I think it's not a proper solution and I would like to find some alternatives first.

Any suggestions on how to tackle this problem?

I suspect it's the Transformer that's adding the blank lines.

Instead of using the default transformer ( transformerFactory.newTransformer() ), try passing in an XSLT that has xsl:strip-space set ( transformerFactory.newTransformer(new StreamSource(new File(PATH_TO_XSLT_FILE))); )...

XSLT File

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
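
As a rough sketch of how the stylesheet above could be wired in from Java: the class name StripSpaceTransform and the method reformat are made up for illustration, and the XSLT is inlined as a string (instead of being read from PATH_TO_XSLT_FILE) only to keep the example self-contained. It also reads the input from a string; in saveUser the DOMSource and StreamResult would stay as they are and only the newTransformer(...) call would change.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of the original code.
final class StripSpaceTransform {

    // The identity stylesheet from the answer, inlined (assumption: a file works the same way).
    static final String XSLT =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output indent=\"yes\"/>"
        + "<xsl:strip-space elements=\"*\"/>"
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the XML through the stylesheet: whitespace-only text nodes are
    // stripped from the input, then the serializer re-indents the result.
    static String reformat(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }
}
```

Calling StripSpaceTransform.reformat(...) on a file that already contains accumulated blank lines should return the same elements without the whitespace-only lines.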

In short: in Java 9, the only option is to delete the blank lines, either after the XML is generated or right after it is parsed from the file, like this:

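private void clearBlankLine(Element element) {
    NodeList childNodes = element.getChildNodes();
    // Iterate backwards: the NodeList is live, so removing a node while
    // counting upwards would shift the indexes and skip the next sibling.
    for (int index = childNodes.getLength() - 1; index >= 0; index--) {
        Node item = childNodes.item(index);
        if (item.getNodeType() == Node.TEXT_NODE
                && item.getNodeValue().trim().isEmpty()) {
            // Whitespace-only text node (line breaks plus indentation): drop it.
            element.removeChild(item);
        } else if (item instanceof Element) {
            clearBlankLine((Element) item);
        }
    }
}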

Then invoke this with the root element, before the document is passed to the Transformer.

Details:

During XML generation, each element goes through three lifecycle stages: startElement, parse, endElement. The indent feature is implemented in the startElement stage, and indenting writes a line break into the document.

The call stack differs between Java 8 and Java 9:

In Java 8: ToStream#startElement-> ToStream#indent(IfNecessary)

In Java 9: ToStream#startElement->ToStream#flushCharactersBuffer(IfNecessary)->ToStream#indent(IfNecessary)

flushCharactersBuffer also indents when the indent feature is enabled with transformer.setOutputProperty(OutputKeys.INDENT, "yes"); and the conditions for invoking flushCharactersBuffer and indent are almost the same.

This means that in Java 9 two line breaks are written for each element that needs indenting, which is where the blank lines come from.

Your solution and the suggestion below both work fine for me. Please try this test case:
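
To see what the serializer is handed in the first place, here is a small sketch that counts the whitespace-only text nodes the parser keeps in the DOM. These are the same nodes that show up as empty "Node text content:" lines in the question. The class name WhitespaceNodeCounter and its methods are made up for illustration and are not part of the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical diagnostic helper.
final class WhitespaceNodeCounter {

    // Parses the XML string and returns how many text nodes contain only whitespace.
    static int countBlankTextNodes(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return count(doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Recursively walks the children of the given node.
    private static int count(Node node) {
        int total = 0;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                total++;               // line break and/or indentation only
            } else {
                total += count(child); // descend into elements
            }
        }
        return total;
    }
}
```

Every line break and indent between tags in an already indented file becomes one such node, so a file written with INDENT set to "yes" carries them into the next run unless they are removed first.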

public static void main(String[] args) {

    saveUser("test one", "test two", "test three");

}

private static void saveUser(String firstName, String lastName, String username){
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    try {
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new File("second.xml"));
        Element root = doc.getDocumentElement();
        root.normalize();

        // build user node
        Element userNode = doc.createElement("user");
        Element nameNode =  doc.createElement("name");
        Element lastNameNode = doc.createElement("last-name");
        Element usernameNode = doc.createElement("username");

        userNode.appendChild(nameNode).setTextContent(firstName); //set the text content
        userNode.appendChild(lastNameNode).setTextContent(lastName);
        userNode.appendChild(usernameNode).setTextContent(username);
        root.appendChild(userNode);

        //write the updated document to file or console
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        DOMSource source = new DOMSource(doc);
        StreamResult result = new StreamResult(new File("second.xml"));
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.transform(source, result);

     }catch (Exception e) {
        e.printStackTrace();
     }
}

second.xml (before execution)

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
</users>

second.xml (first execution)

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
</users>

second.xml (second execution)

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
</users>

second.xml (third execution)

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
<user>
<name>test one</name>
<last-name>test two</last-name>
<username>test three</username>
</user>
</users>

Imported classes:

import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.parsers.DocumentBuilder; // missing import class

import org.w3c.dom.Document;
import org.w3c.dom.Element;

I found this solution using an XPath to be much cleaner than any of the others here (h/t to Isaac for his answer over at https://stackoverflow.com/a/12670194/1339923 ). It doesn't require a separate (i.e., XSLT) file, and doesn't require you to add 14 lines of Java to iterate over every node in the Document. Only 6 lines of code.

In the case of @pablo-r-grande's original question... right before this comment (i.e., just before loading the Document into the DOMSource ):

//write the updated document to file or console

...I would add these lines:

// Generate list of all empty Nodes, then remove them
XPath xp = XPathFactory.newInstance().newXPath();
NodeList nl = (NodeList) xp.evaluate("//text()[normalize-space(.)='']", doc, XPathConstants.NODESET);
for (int i = 0; i < nl.getLength(); ++i) { // note the position of the '++'
    Node node = nl.item(i);
    node.getParentNode().removeChild(node);
}
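
For anyone who wants to check this without touching a real file, here is a self-contained sketch of the whole round trip (parse, strip with the XPath above, append a user, write with INDENT). The class AppendUserRoundTrip and the method appendUser are hypothetical names, and it works on strings instead of databaseFile purely so it can be run repeatedly in memory.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical in-memory variant of saveUser, for experimenting only.
final class AppendUserRoundTrip {

    // Takes the current XML, appends one <user>, and returns the new XML.
    static String appendUser(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            // Remove whitespace-only text nodes left over from the previous write.
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList blanks = (NodeList) xp.evaluate(
                "//text()[normalize-space(.)='']", doc, XPathConstants.NODESET);
            for (int i = 0; i < blanks.getLength(); ++i) {
                Node node = blanks.item(i);
                node.getParentNode().removeChild(node);
            }

            // Build and append the new user (only <name>, to keep the sketch short).
            Element user = doc.createElement("user");
            Element nameNode = doc.createElement("name");
            nameNode.setTextContent(name);
            user.appendChild(nameNode);
            doc.getDocumentElement().appendChild(user);

            // Serialize with indentation, as in the question.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }
}
```

Feeding the result of one call into the next simulates running the program several times; with the stripping step in place the output should stay free of blank lines no matter how many times it is repeated.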
