
Replace double quote with &quot; in XML file

I have an XML file that contains a double quote inside an attribute value, as follows:

<feast key="NAME" value="NAME TEST 'xxxxx"yyyy' $"/>

I need to replace xxxxx"yyyy with xxxxx&quot;yyyy in every occurrence.

Note: xxxxx and yyyy are defined by the user, so they can take any form.

Here I have included the sample XML and the code that parses it.

TestSaxParse.xml

<?xml version="1.0" encoding="US-ASCII" ?> 
<TEST Office="TEST Office">
    <LINE key="112313133320">
        <TESTNO value="0"/>
        <FEATURE>
            <feast key="001" value="001"/>
            <feast key="NAME" value="NAME TEST 'xxxxx_&_yyyy' $"/>
        </FEATURE>
    </LINE>
    <LINE key="112313133321">
        <TESTNO value="0"/>
        <FEATURE>
            <feast key="002" value="002"/>
            <feast key="NAME" value="NAME TEST 'xxxxx"yyyy' $"/>
        </FEATURE>
    </LINE>
</TEST>

SaxParseEx.java

import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxParseEx extends DefaultHandler{

    private static String xmlFilePath = "/home/system/TestSAXParse.xml";

    public static void main(String[] args) {

        SaxParseEx SaxParseEx = new SaxParseEx();
        SAXParserFactory fact = SAXParserFactory.newInstance();
        SAXParser parser;
        try {

            Path path = Paths.get(xmlFilePath);
            Charset charset = StandardCharsets.UTF_8;
            String content = new String(Files.readAllBytes(path), charset);

            // Replace & with &amp;
            content = content.replaceAll("(&(?!amp;))", "&amp;");
            // TODO: need a regex that replaces " with &quot; only in the place described above
            // content = content.replaceAll("(\"(?!quot;))", "&quot;");

            // Write updated content to XML file
            Files.write(path, content.getBytes(charset));

            // XML Parsing
            parser = fact.newSAXParser();
            parser.parse(new File(xmlFilePath), SaxParseEx);
            System.out.println("PARSE SUCCESS");
            return;
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("PARSE FAILED");
    }
}

Output:

org.xml.sax.SAXParseException; systemId: file:/home/system/TestSAXParse.xml; lineNumber: 14; columnNumber: 46; Element type "feast" must be followed by either attribute specifications, ">" or "/>".

I replaced every & with &amp; to fix the SAXParseException on line 7. I cannot replace " with &quot; the same way, because a blanket replacement would also escape the quotes that delimit the attribute values.
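One caveat with the `&(?!amp;)` pattern: it only protects `&amp;`, so if the file already contains `&quot;` or `&lt;` (for example after a second run), those would become `&amp;quot;` and `&amp;lt;`. A minimal sketch of a stricter pattern follows, assuming the file uses only the five predefined XML entities and numeric character references. The class and method names are hypothetical and not part of the original code.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original SaxParseEx class.
final class AmpersandEscaper {

    // Matches an '&' that does NOT already start one of the predefined
    // entities (&amp; &lt; &gt; &quot; &apos;) or a numeric character
    // reference such as &#38; or &#x26;.
    private static final Pattern BARE_AMPERSAND =
            Pattern.compile("&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)");

    static String escapeAmpersands(String xml) {
        // '&' has no special meaning in a Java replacement string
        // (only '$' and '\' do), so "&amp;" is safe to pass directly.
        return BARE_AMPERSAND.matcher(xml).replaceAll("&amp;");
    }
}
```

With this sketch, `AmpersandEscaper.escapeAmpersands("a & b &quot;")` returns `a &amp; b &quot;`, leaving the existing entity untouched.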

EDIT:

I cannot use this answer. I'm looking for a different solution because:

  1. The XML file is large (> 100 MB).
  2. So I do not think it is feasible to compile a pattern and replace every line within double-quoted values, as that answer suggests.
  3. I'm looking for a single replace-all call like this one:

content = content.replaceAll( "(&(?!amp;))", "&amp;");

Is it possible to write a regex like that?
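One possible single-call form is a lookaround heuristic. The sketch below is only an illustration and rests on an assumption about the file's layout: every double quote that delimits an attribute value either directly follows `=` or is directly followed by whitespace, `/`, `>` or `?`. Any other double quote is treated as stray. A stray quote that happens to sit right after `=` or right before a space would be missed, so this is not a general XML repair. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper illustrating a lookaround-based replaceAll.
final class StrayQuoteEscaper {

    // (?<!=)        the quote is not an opening delimiter (not preceded by '=')
    // (?![\s/>?])   the quote is not a closing delimiter (not followed by
    //               whitespace, '/', '>' or '?')
    private static final Pattern STRAY_QUOTE =
            Pattern.compile("(?<!=)\"(?![\\s/>?])");

    static String escapeStrayQuotes(String xml) {
        return STRAY_QUOTE.matcher(xml).replaceAll("&quot;");
    }
}
```

Applied to the sample line, only the quote between xxxxx and yyyy changes; the quotes around `NAME` and around the whole `value` attribute are kept.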

I replaced every " with &quot; wherever it is enclosed in ' characters. To do that, I added the lines below before Files.write:

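// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
Pattern pattern = Pattern.compile("'(.*[\"].*)'");
Matcher matcher = pattern.matcher(content);
while (matcher.find()) {
    // String.replace is literal; replaceAll would treat the matched text as a regex
    // and any "$" or "\" in the replacement as special characters
    content = content.replace(matcher.group(1), matcher.group(1).replace("\"", "&quot;"));
}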
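Because each loop iteration above rescans the whole string, a file of more than 100 MB may be slow to process this way. A single-pass variant can be sketched with `Matcher.appendReplacement`. It assumes, as the approach above does, that the stray quote always sits between two apostrophes on the same line; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: escapes '"' inside single-quoted runs in one pass.
final class QuotedRunEscaper {

    // A run that starts and ends with an apostrophe, stays on one line,
    // has no apostrophe inside, and contains at least one double quote.
    private static final Pattern QUOTED_RUN =
            Pattern.compile("'[^'\\n]*\"[^'\\n]*'");

    static String escapeQuotesInsideApostrophes(String xml) {
        Matcher m = QUOTED_RUN.matcher(xml);
        StringBuffer out = new StringBuffer(); // StringBuffer keeps this Java 8 compatible
        while (m.find()) {
            String fixed = m.group().replace("\"", "&quot;");
            // quoteReplacement keeps '$' and '\' in user text from being
            // interpreted as group references or escapes.
            m.appendReplacement(out, Matcher.quoteReplacement(fixed));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

In the code above, this would replace the loop with `content = QuotedRunEscaper.escapeQuotesInsideApostrophes(content);`. An attribute value that contains an unpaired apostrophe could still be matched incorrectly, so it is worth checking the result against real data.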
