Using SAX (Java) to parse multiple XML messages from a single TCP stream
I'm using Java to connect to a TCP port, and I am streamed XML documents one after another, each delimited by the <?xml start-of-document tag. An example which demonstrates the format:
<?xml version="1.0"?>
<person>
<name>Fred Bloggs</name>
</person>
<?xml version="1.0"?>
<person>
<name>Peter Jones</name>
</person>
I'm using the org.xml.sax.* API. The SAX parsing works perfectly for the first document but throws an exception when it comes across the start of the second document:
Exception in thread "main" org.xml.sax.SAXParseException: The processing instruction
target matching "[xX][mM][lL]" is not allowed.
The following skeleton class demonstrates the setup I'm using:
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;
import java.net.Socket;

public class XMLTest extends DefaultHandler {

    public XMLTest() {
        super();
    }

    public static void main(String[] args) throws Exception {
        XMLReader xr = XMLReaderFactory.createXMLReader();
        XMLTest handler = new XMLTest();
        xr.setContentHandler(handler);
        xr.setErrorHandler(handler);
        // Parse straight from the socket; this fails at the second <?xml declaration.
        xr.parse(new InputSource(new Socket("127.0.0.1", 4555).getInputStream()));
    }
}
I have no control over the format of the XML (it's a financial data feed), but I need to be able to parse it efficiently, and parse all the documents. I've spent the afternoon/evening trying different things, but none have yielded results. Any help would be greatly appreciated.
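A well-formed XML document may contain only one XML declaration, and only at its very start, so the parser rejects the second <?xml version="1.0"?> as an illegal processing instruction. You'd like to split the stream on every <?xml version="1.0"?> and parse each document separately. A BufferedReader may be helpful for this. Kickoff example: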
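BufferedReader reader = new BufferedReader(new InputStreamReader(input, "UTF-8"));
StringBuilder builder = null;
for (String line; (line = reader.readLine()) != null;) {
    if (line.startsWith("<?xml")) {
        if (builder != null) {
            // Parse the previous document. Wrap the text in a StringReader,
            // because InputSource(String) expects a system ID (URI), not XML.
            xr.parse(new InputSource(new StringReader(builder.toString())));
        }
        builder = new StringBuilder();
    }
    if (builder != null) {
        builder.append(line).append('\n'); // readLine() strips the line terminator
    }
}
if (builder != null) {
    // Parse the last document once the stream has ended.
    xr.parse(new InputSource(new StringReader(builder.toString())));
}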
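
To make the splitting approach above concrete, here is a minimal, self-contained sketch that can be run without a socket. It is not the feed's real client: the class name MultiDocumentParser, the NameCollector handler, and the parseAll helper are hypothetical names introduced for illustration, and the sample feed is the two-document example from the question. It also obtains the XMLReader through SAXParserFactory rather than XMLReaderFactory, which is a substitution on my part; either should work for this purpose. The sketch assumes every XML declaration starts at the beginning of a line, as in the sample.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class MultiDocumentParser {

    /** Hypothetical handler: collects the text of every <name> element. */
    static class NameCollector extends DefaultHandler {
        final List<String> names = new ArrayList<>();
        private StringBuilder text; // non-null only while inside <name>

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("name".equals(qName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("name".equals(qName)) {
                names.add(text.toString());
                text = null;
            }
        }
    }

    /** Splits the input on lines starting with "<?xml" and parses each chunk on its own. */
    public static List<String> parseAll(Reader input)
            throws IOException, SAXException, ParserConfigurationException {
        XMLReader xr = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        NameCollector handler = new NameCollector();
        xr.setContentHandler(handler);
        xr.setErrorHandler(handler);

        BufferedReader reader = new BufferedReader(input);
        StringBuilder doc = null;
        for (String line; (line = reader.readLine()) != null;) {
            if (line.startsWith("<?xml")) {
                parseIfPresent(xr, doc); // the previous document is now complete
                doc = new StringBuilder();
            }
            if (doc != null) {
                doc.append(line).append('\n');
            }
        }
        parseIfPresent(xr, doc); // last document, once the stream ends
        return handler.names;
    }

    private static void parseIfPresent(XMLReader xr, StringBuilder doc)
            throws IOException, SAXException {
        if (doc != null) {
            xr.parse(new InputSource(new StringReader(doc.toString())));
        }
    }

    public static void main(String[] args) throws Exception {
        String feed =
              "<?xml version=\"1.0\"?>\n<person>\n<name>Fred Bloggs</name>\n</person>\n"
            + "<?xml version=\"1.0\"?>\n<person>\n<name>Peter Jones</name>\n</person>\n";
        System.out.println(parseAll(new StringReader(feed))); // prints [Fred Bloggs, Peter Jones]
    }
}
```

For the real feed, the StringReader in main would be replaced by an InputStreamReader over the socket's input stream. One caveat of splitting on the next declaration: a document is only handed to the parser when the following <?xml line arrives or the stream closes, so on a live feed the most recent message is always held back until then.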