
How to perform replaceAll excluding comments in Java

I have a file, typically an XML file. I want to replace all occurrences of 'x.y' with 'p.q', but during this replacement I want to ignore occurrences of x.y inside comments (<!-- ... -->).

I was trying to use String.replaceAll() to perform this task.

For example:

<?xml version="1.0" encoding="UTF-8"?>
<name>This occurrence of x.y should be replaced</name>
<!-- This occurrence of x.y should not be replaced -->

I tried using String.replaceAll("x[\\.]y", "p.q"), but occurrences inside comments are replaced as well.

I could instead read the file line by line and skip the lines that start with a comment, but I am interested in using replaceAll().

Please provide a way by which this can be achieved.

Although this isn't strictly the answer you are looking for, I have a recommendation.

I'd recommend using a proper XML parser such as the Java DOM API to check and replace text in your nodes, rather than treating your XML as a raw String. Something like this should replace the matching text in each text node while leaving comments alone.

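import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Parse the file into a DOM tree.
File f = new File("your.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(f);

// Visit every element and rewrite only its direct text children.
// Comments are Node.COMMENT_NODE, so they are never touched.
NodeList eList = doc.getElementsByTagName("*");
for (int e = 0; e < eList.getLength(); e++) {
    Node element = eList.item(e);
    NodeList nList = element.getChildNodes();
    for (int n = 0; n < nList.getLength(); n++) {
        Node node = nList.item(n);
        if (node.getNodeType() == Node.TEXT_NODE) {
            // String.replace is a literal replacement, so "." is not a regex wildcard here.
            node.setNodeValue(node.getNodeValue().replace("x.y", "p.q"));
        }
    }
}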

If memory or efficiency is an issue (for example, when your.xml is huge), you would be better off using SAX, which is faster (though a little more code-intensive) and doesn't hold the whole XML document in memory.

Once your Document has been edited, you'll probably want to use a Transformer to write it back out. (See the official guide, courtesy of Boris the Spider's comment.)
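To make the parse, replace, and write-back steps concrete, here is a minimal end-to-end sketch. The class name `DomReplaceSketch`, the helper `replaceInTextNodes`, and the `<root>` wrapper element in the sample input are assumptions made for illustration; they are not part of the original answer. It reads the XML from a string rather than a file so that it runs as-is.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomReplaceSketch {

    // Hypothetical helper: parse the XML, rewrite text nodes only, serialize the result.
    static String replaceInTextNodes(String xml, String target, String replacement)
            throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // Same traversal as in the answer: text children of every element.
        NodeList elements = doc.getElementsByTagName("*");
        for (int e = 0; e < elements.getLength(); e++) {
            NodeList children = elements.item(e).getChildNodes();
            for (int n = 0; n < children.getLength(); n++) {
                Node node = children.item(n);
                if (node.getNodeType() == Node.TEXT_NODE) {
                    node.setNodeValue(node.getNodeValue().replace(target, replacement));
                }
            }
        }

        // An identity Transformer copies the DOM tree to the output unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample input; the <root> wrapper is an assumption for this sketch.
        String xml = "<root>"
                + "<name>This occurrence of x.y should be replaced</name>"
                + "<!-- This occurrence of x.y should not be replaced -->"
                + "</root>";
        System.out.println(replaceInTextNodes(xml, "x.y", "p.q"));
    }
}
```

To write to a file instead, pass `new StreamResult(new File(...))` to `transform`. Note that the serializer may normalize details such as the XML declaration, so the output is not guaranteed to be byte-for-byte identical to the input outside the replaced text.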

Hope this helps.


If using regex, one option is to use lookarounds so that only matches outside comments are replaced:

(?s)x\.y(?!(?:(?!<!--).)+-->)

As a Java string:

"(?s)x\\.y(?!(?:(?!<!--).)+-->)"

The (?s) DOTALL modifier makes the . also match newlines.
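As a rough illustration of how the pattern plugs into String.replaceAll(), here is a small sketch run against the sample from the question. The class name `RegexOutsideComments` and the helper `replaceOutsideComments` are hypothetical names chosen for this example.

```java
class RegexOutsideComments {

    // The lookahead pattern from above: match x.y unless a "-->" follows
    // with no "<!--" in between (which would mean we are inside a comment).
    static final String OUTSIDE_COMMENTS = "(?s)x\\.y(?!(?:(?!<!--).)+-->)";

    // Hypothetical helper wrapping String.replaceAll with that pattern.
    static String replaceOutsideComments(String xml) {
        return xml.replaceAll(OUTSIDE_COMMENTS, "p.q");
    }

    public static void main(String[] args) {
        String xml = String.join("\n",
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
                "<name>This occurrence of x.y should be replaced</name>",
                "<!-- This occurrence of x.y should not be replaced -->");
        // Only the occurrence inside <name> becomes p.q; the comment keeps x.y.
        System.out.println(replaceOutsideComments(xml));
    }
}
```

This is a heuristic that assumes well-formed, non-nested comments. One edge case: because the inner group uses +, an x.y immediately followed by --> (with no character in between) is still replaced; changing the + to * would cover that case.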

Test at regexplanet (click on Java).
