
Error Parsing RSS -> org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Premature end of file

I have a method that parses RSS from different URLs, and it works well.

For example: https://www.clarin.com/rss/lo-ultimo/

But for one of these URLs ( https://www.cio.com/category/mobile/index.rss ), and for every other RSS feed on that site, the parser fails when I run the code and the console shows the following error:

org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Premature end of file.

I'm parsing the RSS feeds with this method (part of the code):

        try {
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();

            URL url = new URL("https://www.cio.com/category/mobile/index.rss");
            URLConnection urlConnection = url.openConnection();
            InputStream inputStream = urlConnection.getInputStream();

            Document doc = dBuilder.parse(inputStream);

The error happens on the last line: Document doc = dBuilder.parse(inputStream);

That code parses the RSS from the URL. The strange thing is that when I parse the RSS directly from a file (index.rss), I get no errors and the parsing works well. I do this using:

    File fXmlFile = new File("index.rss");
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(fXmlFile);
    // instead of: Document doc = dBuilder.parse(inputStream);
    doc.getDocumentElement().normalize();

Notes:

  • This is a Maven webapp project.
  • It is deployed on a Tomcat 9.0 server.
  • The method runs when I press a button on the site's main page.

I mention this because when I tried the same thing in a simple Java project, the parser worked fine with the inputStream too.

I would very much appreciate any help with this, thanks!

I've run the following code and it works fine, without errors.

     public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {

        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();

        URL url = new URL("https://www.cio.com/category/mobile/index.rss");
        URLConnection urlConnection = url.openConnection();
        InputStream inputStream = urlConnection.getInputStream();

        Document doc = dBuilder.parse(inputStream);
        Element root = doc.getDocumentElement();
        NodeList children = root.getChildNodes();

        for (int i = 0; i < children.getLength(); i++) {
             System.out.println(children.item(i));
        }

        inputStream.close();

     }

Then I added the following and attempted to parse an empty file:

    File fXmlFile = new File("EmptyFile.xml");
    inputStream = new FileInputStream(fXmlFile);
    doc = dBuilder.parse(inputStream);
    System.out.println(doc.getDocumentElement());

When the file was empty (or contained only the XML declaration), I received the same error you are getting. When I added a root element, the error disappeared. To me this indicates that the error occurs when inputStream (or whatever it is streaming) is essentially empty. This theory also seems to be supported by: org.xml.sax.SAXParseException: Premature end of file for *VALID* XML.

I would therefore suggest, if you're still receiving this error, putting a breakpoint on URL url... and stepping through to see whether the connection is being made properly. Hope that helps.
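As an alternative to stepping through with a debugger, the response can be inspected before it reaches the parser. The sketch below is only illustrative: FeedDiagnostics, parseBody and fetchAndParse are hypothetical names, and the User-Agent value is an assumption, included only because some servers answer differently depending on the client. It prints the HTTP status and the body size, then refuses to parse an empty body.

    ```java
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import org.w3c.dom.Document;
    import org.xml.sax.SAXException;

    public class FeedDiagnostics {

        /** Parses an already-downloaded body, failing early with a clear message if it is empty. */
        static Document parseBody(byte[] body)
                throws ParserConfigurationException, SAXException, IOException {
            if (body.length == 0) {
                throw new IOException("Response body is empty; parsing it would end in 'Premature end of file'");
            }
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new ByteArrayInputStream(body));
        }

        /** Downloads the feed and prints the status and size so an empty response becomes visible. */
        static Document fetchAndParse(String feedUrl)
                throws ParserConfigurationException, SAXException, IOException {
            HttpURLConnection connection = (HttpURLConnection) new URL(feedUrl).openConnection();
            // Assumption: illustrative header value, not something the original code sets.
            connection.setRequestProperty("User-Agent", "Mozilla/5.0 (feed-diagnostics)");
            try {
                System.out.println("HTTP status: " + connection.getResponseCode());
                try (InputStream in = connection.getInputStream()) {
                    // Manual copy loop so the sketch also compiles on Java 8.
                    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                    byte[] chunk = new byte[8192];
                    int read;
                    while ((read = in.read(chunk)) != -1) {
                        buffer.write(chunk, 0, read);
                    }
                    byte[] body = buffer.toByteArray();
                    System.out.println("Body length: " + body.length + " bytes");
                    return parseBody(body);
                }
            } finally {
                connection.disconnect();
            }
        }

        public static void main(String[] args) throws Exception {
            Document doc = fetchAndParse("https://www.cio.com/category/mobile/index.rss");
            System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
        }
    }
    ```

If the printed status is not 200 or the body length is 0, the problem lies in the HTTP exchange (for example a redirect, or a response that depends on the request headers) rather than in the XML itself. Running the same check inside Tomcat and in the standalone project may show where the two environments differ.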


