Getting error while parsing an XML 1.1 document with Stax parser
I am trying to parse a Burp Suite XML export. I have used both a StAX parser and an XPath parser, but I am getting
Location: /py/message/viewBill.pt [id parameter]]]></location>
<severity>High</severity>
<confidence>Certain</confidence>
<issueBackground><![CDATA[Reflected
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[66,2357]
Message: The element type "location" must be terminated by the matching end-tag "</location>".
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:604)
at com.sun.xml.internal.stream.XMLEventReaderImpl.nextEvent(XMLEventReaderImpl.java:83)
error all the time. Although there is an end tag, the parser cannot find it. My code is:
XMLInputFactory factory = XMLInputFactory.newInstance();
XMLEventReader eventReader = factory.createXMLEventReader(new StringReader(str));
while (eventReader.hasNext()) {
    XMLEvent event = eventReader.nextEvent();
    switch (event.getEventType()) {
        case XMLStreamConstants.START_ELEMENT:
            StartElement startElement = event.asStartElement();
            String qName = startElement.getName().getLocalPart();
            if (qName.equalsIgnoreCase(ISSUES)) {
                issues = true;
            } else if (qName.equalsIgnoreCase(ISSUE)) {
                issue = true;
            } else if (qName.equalsIgnoreCase(NAME)) {
                name = true;
            } else if (qName.equalsIgnoreCase(HOST)) {
                host = true;
            } else if (qName.equalsIgnoreCase(PATH)) {
                path = true;
            } else if (qName.equalsIgnoreCase(LOCATION)) {
                location = true;
            } else if (qName.equalsIgnoreCase(SEVERITY)) {
                severity = true;
            }
            break;
        case XMLStreamConstants.CHARACTERS:
            Characters characters = event.asCharacters();
            if (name) {
                System.out.println("Name: " + characters.getData());
                name = false;
            } else if (host) {
                System.out.println("Host: " + characters.getData());
                host = false;
            } else if (path) {
                System.out.println("Path: " + characters.getData());
                path = false;
            } else if (location) {
                System.out.println("Location: " + characters.getData());
                location = false;
            } else if (severity) {
                System.out.println("severity: " + characters.getData());
                severity = false;
            }
            break;
        case XMLStreamConstants.END_ELEMENT:
            EndElement endElement = event.asEndElement();
            String endElementName = endElement.getName().getLocalPart();
            if (endElementName.equalsIgnoreCase(ISSUE)) {
                issue = false;
            } else if (endElementName.equalsIgnoreCase(NAME)) {
                name = false;
            } else if (endElementName.equalsIgnoreCase(HOST)) {
                host = false;
            } else if (endElementName.equalsIgnoreCase(PATH)) {
                path = false;
            } else if (endElementName.equalsIgnoreCase(LOCATION)) {
                location = false;
            }
            break;
    }
}
I am trying to parse the report that I found at https://github.com/mtesauro/parse-tools/blob/master/examples/brief-burp-export.xml .
Can someone give some advice?
I would hazard a guess that it's a bug in the XML parser. Specifically, I suspect it hasn't recognized the ]]]> on line 63 as terminating the CDATA section, so it carries on thinking it's in CDATA until the ]]> at the end of line 66, at which point it finds the end tag </issueBackground> where it was looking for </location>. Raise a ticket with the suppliers of the XML parser, or switch to one that works.
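One way to check this guess is a minimal reproduction: feed the parser a single element whose CDATA section ends in ]]]> and see whether the text comes back intact. The sketch below is only an illustration under assumptions. The class name CdataProbe and the sample string are made up for this example and are not part of the original report. It concatenates every text event in the document, so a parser that handles the terminator correctly returns the content including its trailing ].

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for reproducing the problem; not taken from the question.
class CdataProbe {

    /** Returns all character data in the document, whether reported as CHARACTERS, CDATA or SPACE. */
    static String readText(String xml) {
        StringBuilder text = new StringBuilder();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA
                        || event == XMLStreamConstants.SPACE) {
                    text.append(reader.getText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Rethrow unchecked so the parse error (row/col) is still visible to the caller.
            throw new IllegalStateException("Parser rejected the document", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // The CDATA content ends with ']' so the section closes with "]]]>".
        String body = "<location><![CDATA[/py/message/viewBill.pt [id parameter]]]></location>";
        System.out.println(readText(body));
        // To exercise the XML 1.1 code path, prepend a declaration:
        // readText("<?xml version=\"1.1\"?>" + body)
    }
}
```

Running the same probe with and without the XML 1.1 declaration, and with a different StAX implementation on the classpath, should show whether the failure is tied to one parser and one XML version.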
I found some examples that parse the Burp export with CSS selectors. Then I found Jsoup, which does CSS-style parsing in Java. It is a bit complicated but works well.
Document document = Jsoup.parse(str);
Elements allElements = document.getAllElements();
for (Element element : allElements) {
    String tagName = element.tagName();
    String text = element.text();
    if (tagName.equalsIgnoreCase("name")) {
        System.out.println("name " + text);
    } else if (tagName.equalsIgnoreCase("host")) {
        System.out.println("host " + text);
        System.out.println("ip " + element.attr("ip"));
    }
}
I was facing the same issue. After spending some time searching online, I found the solution below.
Since the XML value is wrapped in CDATA, the event type will be XMLEvent.CDATA and not XMLEvent.CHARACTERS:
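while (reader.hasNext()) {
    int eventType = reader.next();
    switch (eventType) {
        case XMLEvent.START_ELEMENT:
            // The element's text follows its start tag; it may arrive as CDATA or CHARACTERS.
            eventType = reader.next();
            if (eventType == XMLEvent.CDATA || eventType == XMLEvent.CHARACTERS) {
                System.out.println(reader.getText());
            }
            break;
        // ........
    }
}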
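An alternative that avoids checking the event type by hand is XMLStreamReader.getElementText(), which reads a text-only element up to its end tag and treats CHARACTERS and CDATA the same way. The following is a rough sketch, not the code from this answer: the class name IssueFieldReader is hypothetical, and the element names are assumed from the constants used in the question (NAME, HOST, PATH, LOCATION, SEVERITY).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper; the field names are assumptions based on the question's code.
class IssueFieldReader {

    private static final List<String> FIELDS =
            Arrays.asList("name", "host", "path", "location", "severity");

    /** Returns one "tag: value" line for every text-only field element in the document. */
    static List<String> readFields(String xml) {
        List<String> out = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    String tag = reader.getLocalName();
                    if (FIELDS.contains(tag.toLowerCase(Locale.ROOT))) {
                        // getElementText() consumes CHARACTERS and CDATA alike
                        // and leaves the reader on the matching END_ELEMENT.
                        out.add(tag + ": " + reader.getElementText());
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<issues><issue>"
                + "<name><![CDATA[Cross-site scripting (reflected)]]></name>"
                + "<severity>High</severity>"
                + "</issue></issues>";
        for (String line : readFields(xml)) {
            System.out.println(line);
        }
    }
}
```

Note that getElementText() throws if the element contains child elements, so it only suits leaf elements such as the ones listed here.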
I have also added the dependency below. I am not sure how this dependency helps, but without it we get the same exception mentioned above. After adding it, the issue was resolved.
<dependency>
    <groupId>com.fasterxml.woodstox</groupId>
    <artifactId>woodstox-core</artifactId>
    <version>5.0.0</version>
</dependency>
https://github.com/FasterXML/woodstox https://mvnrepository.com/artifact/com.fasterxml.woodstox/woodstox-core/5.0.0
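A likely explanation for why the dependency helps: XMLInputFactory.newInstance() looks up a StAX implementation at runtime, and Woodstox registers itself as a provider, so putting the jar on the classpath replaces the JDK's built-in parser without any code change. A small sketch to confirm which implementation is in use (the class name StaxProviderProbe is made up for this example):

```java
import javax.xml.stream.XMLInputFactory;

// Hypothetical helper for checking which StAX implementation gets picked up.
class StaxProviderProbe {

    /** Returns the class name of the factory that XMLInputFactory.newInstance() resolves to. */
    static String providerClassName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // With only the JDK on the classpath this is typically a com.sun.xml.internal.stream class;
        // with woodstox-core present it should be a com.ctc.wstx class instead.
        System.out.println(providerClassName());
    }
}
```

If the printed name does not change after adding the dependency, the jar is probably not on the runtime classpath.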