
How to parse XML comments in Groovy?

Is there any way to parse XML comments in Groovy?

Neither XmlParser nor XmlSlurper seems to support comment nodes.

Suppose the following file (example.html):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>title</title>
<body>
<table cellpadding="1" cellspacing="1" border="1">
<thead>
<tr><td rowspan="1" colspan="3">title</td></tr>
</thead><tbody>
<!--I cannot be seen-->
<tr>
    <td>x</td>
    <td>x</td>
    <td>x</td>
</tr>
</tbody></table>
</body>
</html>

Here is my code:

def parser = new XmlSlurper(false, false)
parser.setFeature("http://apache.org/xml/features/disallow-doctype-decl", false)
parser.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false)

def response = parser.parse('example.html')

When I use

println XmlUtil.serialize(response)

to output the file, the comment does not appear.

Since your input is HTML, you can use jsoup to parse it:

@Grab(group='org.jsoup', module='jsoup', version='1.11.3')
import org.jsoup.Jsoup
import org.jsoup.nodes.Document

def html = '''<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>title</title>
<body>
<table cellpadding="1" cellspacing="1" border="1">
<thead>
<tr><td rowspan="1" colspan="3">title</td></tr>
</thead><tbody>
<!--I cannot be seen-->
<tr>
    <td>x</td>
    <td>x</td>
    <td>x</td>
</tr>
</tbody></table>
</body>
</html>'''

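// Parse the markup, then find the first comment node directly under <tbody>
Document doc = Jsoup.parse(html)
def comment = doc.select('html body table tbody').first()
        ?.childNodes()
        ?.find { it.nodeName() == '#comment' }
println comment?.getData()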
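
If you would rather not add a dependency, the JDK's DOM parser is another option, since DOM keeps comment nodes unless it is told to ignore them. The following is only a minimal sketch under some assumptions: it uses a shortened copy of the sample without the DOCTYPE (so no external DTD is fetched), and `collectComments` is a hypothetical helper closure written for this example, not part of any library.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Node

// Shortened sample (assumption: DOCTYPE removed so no external DTD is loaded)
def xml = '''<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<table><tbody>
<!--I cannot be seen-->
<tr><td>x</td></tr>
</tbody></table>
</body>
</html>'''

def factory = DocumentBuilderFactory.newInstance()
factory.setIgnoringComments(false)   // already the default, spelled out for clarity
def doc = factory.newDocumentBuilder()
        .parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))

// Hypothetical helper: walk the tree and collect the text of every comment node
def collectComments
collectComments = { Node node, List<String> acc ->
    if (node.getNodeType() == Node.COMMENT_NODE) {
        acc << node.getNodeValue()
    }
    def children = node.getChildNodes()
    for (int i = 0; i < children.getLength(); i++) {
        collectComments(children.item(i), acc)
    }
    acc
}

def comments = collectComments(doc, [])
println comments
```

Unlike the jsoup version, this expects well-formed XML, so it fits XHTML like the sample but not arbitrary HTML.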
