How to find and replace an attribute value in XML

I am building an "XML scanner" in Java that finds attribute values starting with "!Here:". The attribute value contains instructions for a later replacement. For example, I have an XML file filled with records like:

<bean value="!Here:Sring:HashKey"></bean>

How can I find and replace the attribute values, knowing only that they start with "!Here:"?

To modify element or attribute values in an XML file while still respecting the XML structure, you need to use an XML parser. It's a bit more involved than just String#replace().

Given an example XML like:

<?xml version="1.0" encoding="UTF-8"?>
<beans> 
    <bean id="exampleBean" class="examples.ExampleBean">
        <!-- setter injection using -->
        <property name="beanTwo" ref="anotherBean"/>
        <property name="integerProperty" value="!Here:Integer:Foo"/>
    </bean>
    <bean id="anotherBean" class="examples.AnotherBean">
        <property name="stringProperty" value="!Here:String:Bar"/>
    </bean>
</beans>

To change the two !Here markers, you need to:

  1. Load the file into a DOM Document.
  2. Select the wanted nodes with XPath. Here I search for all nodes in the document with a value attribute that contains the string !Here. The XPath expression is //*[contains(@value, '!Here')].
  3. Apply the transformation you want to each selected node. Here I just replace !Here with What?.
  4. Save the modified DOM Document to a new file.


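import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

static String inputFile = "./beans.xml";
static String outputFile = "./beans_new.xml";

// 1- Build the doc from the XML file
Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().parse(new InputSource(inputFile));

// 2- Locate the node(s) with XPath
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate("//*[contains(@value, '!Here')]",
                                           doc, XPathConstants.NODESET);

// 3- Make the change on the selected nodes
for (int idx = 0; idx < nodes.getLength(); idx++) {
    Node value = nodes.item(idx).getAttributes().getNamedItem("value");
    String val = value.getNodeValue();
    // replace() does a literal replacement; replaceAll() would treat "!Here" as a regex
    value.setNodeValue(val.replace("!Here", "What?"));
}

// 4- Save the result to a new XML doc
Transformer xformer = TransformerFactory.newInstance().newTransformer();
xformer.transform(new DOMSource(doc), new StreamResult(new File(outputFile)));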

The resulting XML file is:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<beans> 
    <bean class="examples.ExampleBean" id="exampleBean">
        <!-- setter injection using -->
        <property name="beanTwo" ref="anotherBean"/>
        <property name="integerProperty" value="What?:Integer:Foo"/>
    </bean>
    <bean class="examples.AnotherBean" id="anotherBean">
        <property name="stringProperty" value="What?:String:Bar"/>
    </bean>
</beans>
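
Since the question asks for values that start with "!Here:", the same steps can be sketched with starts-with() instead of contains(), followed by a lookup of the replacement. This is only a sketch under assumptions: the class name HereMarkerResolver, the resolve method, and the reading of the marker as "!Here:<Type>:<Key>" with the key looked up in a Map are all hypothetical and not part of the original answer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class HereMarkerResolver {

    private static final String PREFIX = "!Here:";

    /** Replaces each value="!Here:Type:Key" with the map entry for Key, when one exists. */
    public static String resolve(String xml, Map<String, String> values) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // starts-with() matches only values that begin with the marker
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList attrs = (NodeList) xpath.evaluate(
                "//@value[starts-with(., '" + PREFIX + "')]", doc, XPathConstants.NODESET);

        for (int i = 0; i < attrs.getLength(); i++) {
            Attr attr = (Attr) attrs.item(i);
            // Assumed layout "!Here:<Type>:<Key>"; the type part is ignored in this sketch
            String[] parts = attr.getValue().substring(PREFIX.length()).split(":", 2);
            if (parts.length == 2 && values.containsKey(parts[1])) {
                attr.setValue(values.get(parts[1]));
            }
        }

        // Serialize the modified document back to a String
        Transformer xformer = TransformerFactory.newInstance().newTransformer();
        xformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        xformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<beans><property name=\"a\" value=\"!Here:Integer:Foo\"/>"
                + "<property name=\"b\" value=\"keep !Here:String:Bar\"/></beans>";
        System.out.println(resolve(xml, Map.of("Foo", "42")));
    }
}
```

In the main method, the first property is rewritten because its value starts with the marker and its key is in the map; the second is left alone because the marker is not at the start of the value.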

We have some alternatives for this in Java.

  • First, JAXP (it has been bundled with Java since version 1.4).

Let's assume we need to change the customer attribute to false in this XML:

<?xml version="1.0" encoding="UTF-8"?>
<notification id="5">
   <to customer="true">john@email.com</to>
   <from>mary@email.com</from>
</notification>

With JAXP (this implementation is based on @t-gounelle's sample) we could do this:

// resourcePath, attribute, oldValue and newValue are supplied by the caller
// Load the document
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
Document input = factory.newDocumentBuilder().parse(resourcePath);
// Select the node(s) with XPath
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate(String.format("//*[contains(@%s, '%s')]", attribute, oldValue), input, XPathConstants.NODESET);
// Update the selected nodes (here, we use the Stream API, but we can use a for loop too)
IntStream
    .range(0, nodes.getLength())
    .mapToObj(i -> (Element) nodes.item(i))
    .forEach(value -> value.setAttribute(attribute, newValue));
// Get the result as a String (a second variable name avoids redeclaring "factory")
TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
Transformer xformer = transformerFactory.newTransformer();
xformer.setOutputProperty(OutputKeys.INDENT, "yes");
Writer output = new StringWriter();
xformer.transform(new DOMSource(input), new StreamResult(output));
String result = output.toString();

Note that, to disable external entity processing (XXE) for the DocumentBuilderFactory class, we configure the XMLConstants.FEATURE_SECURE_PROCESSING feature and disallow DOCTYPE declarations. It's good practice to configure this when parsing untrusted XML files. Check the OWASP guide for additional information.
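
To make the effect of those settings visible, here is a small sketch of a hardened factory and a check that rejects a document carrying a DOCTYPE. The class and method names (SafeXmlParsing, hardenedFactory, isAccepted) are hypothetical, and the extra setXIncludeAware and setExpandEntityReferences calls are additions that the snippet above does not make. It assumes the JDK's built-in parser, which recognizes the disallow-doctype-decl feature.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class SafeXmlParsing {

    /** Builds a factory that refuses any document containing a DOCTYPE declaration. */
    public static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Extra precautions, not present in the snippet above
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    /** Returns true when the hardened parser accepts the document, false when it rejects it. */
    public static boolean isAccepted(String xml) throws Exception {
        try {
            hardenedFactory().newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException rejected) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isAccepted("<to customer=\"true\">john@email.com</to>"));
        System.out.println(isAccepted(
                "<!DOCTYPE to [<!ENTITY x SYSTEM \"file:///etc/passwd\">]><to>&x;</to>"));
    }
}
```

With the DOCTYPE disallowed, the parser fails before any entity is resolved, so the external file is never read.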

  • Another alternative is dom4j. It's an open-source framework for processing XML that is integrated with XPath and fully supports DOM, SAX, JAXP and Java platform features such as Java Collections.

We need to add the following dependencies to our pom.xml to use it:

<dependency>
    <groupId>org.dom4j</groupId>
    <artifactId>dom4j</artifactId>
    <version>2.1.1</version>
</dependency>
<dependency>
    <groupId>jaxen</groupId>
    <artifactId>jaxen</artifactId>
    <version>1.2.0</version>
</dependency>

The implementation is very similar to the JAXP equivalent:

// Configure the reader before parsing: features to prevent XXE
SAXReader xmlReader = new SAXReader();
xmlReader.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
xmlReader.setFeature("http://xml.org/sax/features/external-general-entities", false);
xmlReader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
// Load the document
Document input = xmlReader.read(resourcePath);
// Select the nodes
String expr = String.format("//*[contains(@%s, '%s')]", attribute, oldValue);
XPath xpath = DocumentHelper.createXPath(expr);
List<Node> nodes = xpath.selectNodes(input);
// Update the selected nodes (the expression only selects elements, so the cast is safe)
nodes.stream()
    .map(node -> (Element) node)
    .forEach(element -> element.addAttribute(attribute, newValue));
// The String representation is available through input.asXML().

Note that, despite its name, addAttribute replaces the attribute if one already exists for the given name; otherwise it adds it. We can find the Javadoc here.

  • Another nice alternative is jOOX, a library whose API is inspired by jQuery.

We need to add the following dependency to our pom.xml to use jOOX.

For use with Java 9+:

<dependency>
    <groupId>org.jooq</groupId>
    <artifactId>joox</artifactId>
    <version>1.6.2</version>
</dependency>

For use with Java 6+:

<dependency>
    <groupId>org.jooq</groupId>
    <artifactId>joox-java-6</artifactId>
    <version>1.6.2</version>
</dependency>

We can implement our attribute changer like this:

// Requires: import static org.joox.JOOX.$;
// Load the document
DocumentBuilder builder = JOOX.builder();
Document input = builder.parse(resourcePath);
Match $ = $(input);
// Select the nodes
$
    .find("to") // We can use an XPath expression too.
    .get()
    .forEach(e -> e.setAttribute(attribute, newValue));
// Get the String representation
String result = $.toString();

As we can see in this sample, the syntax is less verbose than in the JAXP and dom4j samples.

I compared the three implementations with JMH and got the following results:

| Benchmark                         | Mode | Cnt | Score | Error   | Units |
|-----------------------------------|------|-----|-------|---------|-------|
| AttributeBenchMark.dom4jBenchmark | avgt | 5   | 0.167 | ± 0.050 | ms/op |
| AttributeBenchMark.jaxpBenchmark  | avgt | 5   | 0.185 | ± 0.047 | ms/op |
| AttributeBenchMark.jooxBenchmark  | avgt | 5   | 0.307 | ± 0.110 | ms/op |

I put the examples here if you need to take a look.

Gounelle's answer is correct; however, it relies on knowing the attribute name in advance.

If you want to find all attributes based only on their value, use this XPath expression:

NodeList attributes = (NodeList) xpath.evaluate(
    "//*/@*[contains(. , '!Here')]",
    doc,
    XPathConstants.NODESET
);

Here, //*/@* selects all attributes. You can then add a condition like the one mentioned above.

By the way, if you search for a single attribute, you can use Attr instead of Node:

Attr attribute = (Attr) xpath.evaluate(
    "//*/@*[contains(. , '!Here')]",
    doc,
    XPathConstants.NODE
);

attribute.setValue("What!");
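
Putting the attribute-selecting expression above into a complete program might look like the sketch below. The class name AnyAttributeReplacer and its methods are hypothetical. It replaces the marker in every attribute whose value contains !Here, whatever the attribute is called, and returns how many attributes were changed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AnyAttributeReplacer {

    /** Parses an XML string into a DOM Document. */
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Replaces "!Here" in every attribute value, regardless of the attribute name. */
    public static int replaceMarkers(Document doc, String replacement)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList attrs = (NodeList) xpath.evaluate(
                "//*/@*[contains(., '!Here')]", doc, XPathConstants.NODESET);
        for (int i = 0; i < attrs.getLength(); i++) {
            // Each selected node is an attribute, so it can be handled as Attr
            Attr attr = (Attr) attrs.item(i);
            attr.setValue(attr.getValue().replace("!Here", replacement));
        }
        return attrs.getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<beans><bean value=\"!Here:String:A\" ref=\"x\"/>"
                + "<bean other=\"!Here:Integer:B\"/></beans>");
        System.out.println(replaceMarkers(doc, "What?") + " attribute(s) changed");
    }
}
```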

If you want to find attributes by a particular value, use:

"//*/@*[ . = '!Here:String:HashKey' ]"

If you search for attributes using a numeric comparison, for instance, if you had

<bean value="999"></bean>
<bean value="1337"></bean>

then you could select the attribute of the second bean by setting the expression to

"//*/@*[ . > 1000]"

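As a quick way to try the numeric comparison, the sketch below wraps the two beans in a root element (an XML document needs a single root) and collects the matching attribute values. The class and method names are hypothetical. In XPath 1.0 the attribute's string value is converted to a number for the comparison, so a non-numeric value never matches.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NumericAttributeQuery {

    /** Returns the values of all attributes whose numeric value exceeds the threshold. */
    public static List<String> valuesGreaterThan(String xml, int threshold) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList attrs = (NodeList) xpath.evaluate(
                "//*/@*[. > " + threshold + "]", doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            values.add(attrs.item(i).getNodeValue());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<beans><bean value=\"999\"/><bean value=\"1337\"/>"
                + "<bean value=\"abc\"/></beans>";
        System.out.println(valuesGreaterThan(xml, 1000));
    }
}
```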
I recently encountered a similar problem in my current project. I realize this solution might not answer the original question, since it does not take into account the part about

The attribute value contains instructions to replace later

Still, someone might find it useful. We already used StringSubstitutor from Apache Commons Text for replacing values in JSON files.

It turned out to work just as well with XML text in our case. It does operate on Strings, which may not be suitable in all cases.

Given a simple XML like this:

<?xml version="1.0" encoding="UTF-8"?>
<Foo>
    <Bar>${replaceThis:-defaultValueHere}</Bar>
    <bean value="${!Here}:Sring:HashKey"></bean>
</Foo>

StringSubstitutor lets you replace ${replaceThis:-defaultValueHere} with anything. In Java 11, a simple example might look like this:

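// Requires Apache Commons Text: org.apache.commons.text.StringSubstitutor
// Read the file as a string (Java 11+). path is the java.nio.file.Path of the XML file.
// The file declares UTF-8, so it is decoded as UTF-8.
String xml = Files.readString(path, StandardCharsets.UTF_8);

// Specify what to replace
Map<String, String> replacementMappings = Map.of(
    "replaceThis", "Something else",
    "!Here", "Bean"
);

String xmlWithStringsReplaced = new StringSubstitutor(replacementMappings).replace(xml);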

Then xmlWithStringsReplaced should look like:

<?xml version="1.0" encoding="UTF-8"?>
<Foo>
    <Bar>Something else</Bar>
    <bean value="Bean:Sring:HashKey"></bean>
</Foo>
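
One case where working on Strings may not be suitable is a replacement value that contains XML special characters such as & or <, because the text is inserted without the parser's help. The sketch below is a JDK-only stand-in, not StringSubstitutor itself: the class EscapingSubstitution and its methods are hypothetical, it handles only plain ${key} placeholders (no :-default syntax), and it escapes each value before inserting it.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapingSubstitution {

    // Matches ${key}; the key is everything up to the closing brace
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Escapes the five XML special characters so the value is safe in text or attributes. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    /** Replaces known ${key} placeholders with escaped values; unknown ones are left as-is. */
    public static String substitute(String xml, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(xml);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String key = m.group(1);
            String replacement = values.containsKey(key) ? escapeXml(values.get(key)) : m.group();
            // quoteReplacement keeps '$' and '\' in the value from being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<bean value=\"${!Here}:Sring:HashKey\"></bean>";
        System.out.println(substitute(xml, Map.of("!Here", "A&B <x>")));
    }
}
```

Without the escaping step, the same value would leave a bare & and < inside the attribute, and the result would no longer be well-formed XML.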
