
How to find and replace an attribute value in XML

I am building an "XML scanner" in Java that finds attribute values starting with "!Here:". The attribute value contains instructions for a later replacement. For example, I have an XML file filled with records like

<bean value="!Here:Sring:HashKey"></bean>

How can I find and replace the attribute values, knowing only that they start with "!Here:"?

To modify element or attribute values in an XML file while still respecting the XML structure, you need to use an XML parser. It's a bit more involved than just String#replace() ...

Given an example XML like:

<?xml version="1.0" encoding="UTF-8"?>
<beans> 
    <bean id="exampleBean" class="examples.ExampleBean">
        <!-- setter injection using -->
        <property name="beanTwo" ref="anotherBean"/>
        <property name="integerProperty" value="!Here:Integer:Foo"/>
    </bean>
    <bean id="anotherBean" class="examples.AnotherBean">
        <property name="stringProperty" value="!Here:String:Bar"/>
    </bean>
</beans>

To change the two !Here markers, you need to

  1. load the file into a DOM Document,
  2. select the wanted nodes with XPath. Here I search for all elements in the document whose value attribute contains the string !Here. The XPath expression is //*[contains(@value, '!Here')].
  3. apply the transformation you want to each selected node. Here I just replace !Here with What?.
  4. save the modified DOM Document into a new file.


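import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

static String inputFile = "./beans.xml";
static String outputFile = "./beans_new.xml";

// 1- Build the doc from the XML file
Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().parse(new File(inputFile));

// 2- Locate the node(s) with xpath
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList)xpath.evaluate("//*[contains(@value, '!Here')]",
                                          doc, XPathConstants.NODESET);

// 3- Make the change on the selected nodes
for (int idx = 0; idx < nodes.getLength(); idx++) {
    Node value = nodes.item(idx).getAttributes().getNamedItem("value");
    String val = value.getNodeValue();
    // replace() treats "!Here" literally; replaceAll() would interpret it as a regex
    value.setNodeValue(val.replace("!Here", "What?"));
}

// 4- Save the result to a new XML doc
Transformer xformer = TransformerFactory.newInstance().newTransformer();
xformer.transform(new DOMSource(doc), new StreamResult(new File(outputFile)));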

The resulting XML file is:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<beans> 
    <bean class="examples.ExampleBean" id="exampleBean">
        <!-- setter injection using -->
        <property name="beanTwo" ref="anotherBean"/>
        <property name="integerProperty" value="What?:Integer:Foo"/>
    </bean>
    <bean class="examples.AnotherBean" id="anotherBean">
        <property name="stringProperty" value="What?:String:Bar"/>
    </bean>
</beans>
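
The question asks specifically for values that start with the marker, while the code above matches the marker anywhere in the value. XPath 1.0 has a starts-with() function for that case. The following is a minimal sketch, not a definitive implementation: the class name MarkerResolver, the replacement map, and the reading of the value as "!Here:" + type + ":" + key are assumptions made for illustration, since the question does not define the instruction format.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: resolves value="!Here:Type:Key" attributes from a map.
final class MarkerResolver {

    static final String PREFIX = "!Here:";

    /** Replaces each value attribute that starts with the marker, if its key is mapped. */
    static String resolve(String xml, Map<String, String> replacements) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // starts-with() only matches a leading marker, unlike contains().
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList attrs = (NodeList) xpath.evaluate(
                    "//*/@value[starts-with(., '" + PREFIX + "')]",
                    doc, XPathConstants.NODESET);

            for (int i = 0; i < attrs.getLength(); i++) {
                Attr attr = (Attr) attrs.item(i);
                // Assumed layout after the prefix: type ":" key
                String[] parts = attr.getValue().substring(PREFIX.length()).split(":", 2);
                if (parts.length == 2 && replacements.containsKey(parts[1])) {
                    attr.setValue(replacements.get(parts[1]));
                }
            }

            // Serialize the modified document back to a String.
            Transformer xformer = TransformerFactory.newInstance().newTransformer();
            xformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            xformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not process XML", e);
        }
    }
}
```

Calling MarkerResolver.resolve(xml, Map.of("Bar", "hello")) on the example file would replace only the stringProperty value; markers whose key is not in the map are left untouched, so they can still be found on a later pass.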

We have some alternatives to this in Java.

  • First, JAXP (it has been bundled with Java since version 1.4).

Let's assume we need to change the attribute customer to false in this XML:

<?xml version="1.0" encoding="UTF-8"?>
<notification id="5">
   <to customer="true">john@email.com</to>
   <from>mary@email.com</from>
</notification>

With JAXP (this implementation is based on @t-gounelle's sample) we could do this:

// resourcePath, attribute, oldValue and newValue are supplied by the caller
// Load the document
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
docFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
docFactory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
Document input = docFactory.newDocumentBuilder().parse(resourcePath);
// Select the node(s) with XPath
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate(String.format("//*[contains(@%s, '%s')]", attribute, oldValue), input, XPathConstants.NODESET);
// Update the selected nodes (here, we use the Stream API, but we can use a for loop too)
IntStream
    .range(0, nodes.getLength())
    .mapToObj(i -> (Element) nodes.item(i))
    .forEach(value -> value.setAttribute(attribute, newValue));
// Get the result as a String
TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
Transformer xformer = transformerFactory.newTransformer();
xformer.setOutputProperty(OutputKeys.INDENT, "yes");
Writer output = new StringWriter();
xformer.transform(new DOMSource(input), new StreamResult(output));
String result = output.toString();

Note that, to guard against XML external entity (XXE) processing, we enable the XMLConstants.FEATURE_SECURE_PROCESSING feature on the DocumentBuilderFactory and also disallow DOCTYPE declarations. It's good practice to configure this when parsing untrusted XML files. See the OWASP guide for additional information.
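
The hardening settings can be collected in one place and reused wherever a parser is needed. The sketch below is one possible arrangement, assuming the JDK's default JAXP parser (the disallow-doctype-decl feature URI is the Xerces one already used above); the class name SafeXml and the accepts helper are hypothetical and only there to make the effect observable.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;

// Hypothetical helper that centralizes the XXE-related parser settings.
final class SafeXml {

    /** Builds a factory with secure processing on and DOCTYPE declarations rejected. */
    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Rejecting DOCTYPE outright also rules out entity declarations.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    /** Returns true if the hardened parser accepts the given XML text. */
    static boolean accepts(String xml) {
        try {
            hardenedFactory().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // Includes the SAXParseException raised for a disallowed DOCTYPE.
            return false;
        }
    }
}
```

With this configuration a plain document such as the notification example still parses, while any input that carries a DOCTYPE declaration is rejected before an external entity could be resolved.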

  • Another alternative is dom4j. It's an open-source framework for processing XML that integrates with XPath and fully supports DOM, SAX, JAXP and Java platform features such as Java Collections.

We need to add the following dependencies to our pom.xml to use it:

<dependency>
    <groupId>org.dom4j</groupId>
    <artifactId>dom4j</artifactId>
    <version>2.1.1</version>
</dependency>
<dependency>
    <groupId>jaxen</groupId>
    <artifactId>jaxen</artifactId>
    <version>1.2.0</version>
</dependency>

The implementation is very similar to the JAXP equivalent:

// Features to prevent XXE (they must be set before the document is read)
SAXReader xmlReader = new SAXReader();
xmlReader.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
xmlReader.setFeature("http://xml.org/sax/features/external-general-entities", false);
xmlReader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
// Load the document
Document input = xmlReader.read(resourcePath);
// Select the nodes
String expr = String.format("//*[contains(@%s, '%s')]", attribute, oldValue);
XPath xpath = DocumentHelper.createXPath(expr);
List<Node> nodes = xpath.selectNodes(input);
// Update the selected nodes
IntStream
    .range(0, nodes.size())
    .mapToObj(i -> (Element) nodes.get(i))
    .forEach(value -> value.addAttribute(attribute, newValue));
// The String representation is available through input.asXML(). To reuse the JAXP
// Transformer from the previous snippet, wrap the document in org.dom4j.io.DocumentSource
// instead of DOMSource.

Note that, despite its name, addAttribute replaces the value if an attribute with the given name already exists; otherwise it adds the attribute. The javadoc is here.

  • Another nice alternative is jOOX, a library whose API is inspired by jQuery.

We need to add the following dependencies to our pom.xml to use jOOX.

For use with Java 9+:

<dependency>
    <groupId>org.jooq</groupId>
    <artifactId>joox</artifactId>
    <version>1.6.2</version>
</dependency>

For use with Java 6+:

<dependency>
    <groupId>org.jooq</groupId>
    <artifactId>joox-java-6</artifactId>
    <version>1.6.2</version>
</dependency>

We can implement our attribute changer like this:

// Load the document
DocumentBuilder builder = JOOX.builder();
Document input = builder.parse(resourcePath);
Match $ = $(input);
// Select the nodes
$
    .find("to") // We can use an XPath expression too.
    .get()
    .stream()
    .forEach(e -> e.setAttribute(attribute, newValue));
// Get the String representation
$.toString();

As we can see in this sample, the syntax is less verbose than in the JAXP and dom4j samples.

I compared the 3 implementations with JMH and I got the following results:

| Benchmark                         | Mode | Cnt | Score | Error   | Units |
|-----------------------------------|------|-----|-------|---------|-------|
| AttributeBenchMark.dom4jBenchmark | avgt | 5   | 0.167 | ± 0.050 | ms/op |
| AttributeBenchMark.jaxpBenchmark  | avgt | 5   | 0.185 | ± 0.047 | ms/op |
| AttributeBenchMark.jooxBenchmark  | avgt | 5   | 0.307 | ± 0.110 | ms/op |

I put the examples here if you need to take a look.

Gounelle's answer is correct; however, it assumes that you know the attribute name in advance.

If you want to find all attributes based only on their value, use this XPath expression:

NodeList attributes = (NodeList) xpath.evaluate(
    "//*/@*[contains(. , '!Here')]",
    doc,
    XPathConstants.NODESET
);

Here, //*/@* selects all attributes of all elements, and the predicate then filters them by value as described above.

By the way, if you are searching for a single attribute, you can request a single node and cast it to Attr instead of Node:

Attr attribute = (Attr) xpath.evaluate(
    "//*/@*[contains(. , '!Here')]",
    doc,
    XPathConstants.NODE
);

attribute.setValue("What!");

If you want to find attributes with a particular value, use

"//*/@*[ . = '!Here:String:HashKey' ]"

If you want to search for attributes using a numeric comparison, for instance, if you had

<bean value="999"></bean>
<bean value="1337"></bean>

then you could select the value attribute of the second bean by setting the expression to

"//*/@*[ . > 1000]"
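
To see what each of these expressions actually selects, it can help to list the matched attributes together with their owner elements. This is a small sketch under the assumption that the input is available as a String; the class name AttributeFinder and the "element@attribute=value" output format are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: lists the attributes matched by an XPath expression.
final class AttributeFinder {

    /** Returns one "element@attribute=value" entry per matched attribute node. */
    static List<String> find(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // The expression is expected to select attribute nodes, e.g. //*/@*[...]
            NodeList attrs = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);

            List<String> found = new ArrayList<>();
            for (int i = 0; i < attrs.getLength(); i++) {
                Attr attr = (Attr) attrs.item(i);
                found.add(attr.getOwnerElement().getTagName()
                        + "@" + attr.getName() + "=" + attr.getValue());
            }
            return found;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + expression, e);
        }
    }
}
```

For the two beans above, the numeric expression matches only the attribute holding 1337, because XPath 1.0 converts each attribute value to a number before comparing; values that are not numeric convert to NaN and never match.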

I recently encountered a similar problem in my current project. I realize this solution might not fully answer the original question, since it does not take into account the part about

The attribute value contains instructions to replace later

Still, someone might find it useful. We already used StringSubstitutor from Apache Commons Text for replacing values in JSON files.

It turned out to work just as well with XML text in our case. It does operate on plain Strings, which may not be suitable in all cases.

Given a simple XML like this:

<?xml version="1.0" encoding="UTF-8"?>
<Foo>
    <Bar>${replaceThis:-defaultValueHere}</Bar>
    <bean value="${!Here}:Sring:HashKey"></bean>
</Foo>

StringSubstitutor lets you replace ${replaceThis:-defaultValueHere} with anything. In Java 11, a simple example might look like this:

// Read the file as a string. (Java 11+)
// path is the java.nio.file.Path of the XML file; UTF-8 matches the encoding declared in the XML.
String xml = Files.readString(path, StandardCharsets.UTF_8);

// Specify what to replace
Map<String, String> replacementMappings = Map.of(
    "replaceThis", "Something else",
    "!Here", "Bean"
);

String xmlWithStringsReplaced = new StringSubstitutor(replacementMappings).replace(xml);

Then the xmlWithStringsReplaced should look like:

<?xml version="1.0" encoding="UTF-8"?>
<Foo>
    <Bar>Something else</Bar>
    <bean value="Bean:Sring:HashKey"></bean>
</Foo>
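
One case where working on plain Strings is not suitable: a replacement value containing characters such as & or < is pasted into the XML verbatim and can leave the result malformed. A possible precaution is to escape each replacement value before handing the map to the substitutor. The helper below is a hand-written sketch with a hypothetical name (XmlEscaper); if Apache Commons Text is already on the classpath, its StringEscapeUtils.escapeXml10 should serve the same purpose.

```java
// Hypothetical helper: escapes the five characters XML treats specially.
final class XmlEscaper {

    /** Returns the text with &, <, >, double and single quotes replaced by entities. */
    static String escape(String raw) {
        StringBuilder sb = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Escaping both kinds of quotes keeps the value safe whether it ends up in element text, as in Bar, or inside an attribute, as in the bean's value.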
