
java.util.Scanner#findWithinHorizon throws OutOfMemory exception on a 32 MB input stream

The search string "4914904" exists at the tail of the stream.

Here's the code:

    Scanner sc = new Scanner(xmlInputStream, "UTF-8");
    if (sc.findWithinHorizon("4914904", 0) != null) { // <--- exception is thrown here
    }

Any suggestions would be highly appreciated.

If you read the API documentation for Scanner, you will see that passing 0 as the horizon to findWithinHorizon means the horizon is ignored: the scanner searches the input without bound and may buffer all of it while looking for the pattern.

Since you don't do anything with the value it returns, I see a few options.

Try changing to useDelimiter(String pattern) and then calling if (sc.hasNext()), which may reduce the memory footprint somewhat.

If you have XML, use an XML parser instead of a text scanner.

You could consider writing a custom method that reads the input stream one line at a time and performs the search on each line. That way you don't have to read in the full buffer.

Increase the memory you give the JVM when it starts, for example with -Xmx256m.
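The line-at-a-time option above could be sketched roughly as follows. This is only an illustration, not code from the question: the class name LineSearch and the method containsInAnyLine are made up, and it assumes the input has reasonably short lines and that the search string never spans a line break. An XML document written as one very long line would still be read into memory in one piece.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class LineSearch {

    // Hypothetical helper: returns true if any single line of the stream
    // contains the search string. Only one line is held in memory at a time.
    public static boolean containsInAnyLine(InputStream in, String needle) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(needle)) {
                    return true; // stop reading as soon as a match is found
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // Small in-memory stand-in for the real xmlInputStream.
        byte[] sample = "<root>\n<id>123</id>\n<id>4914904</id>\n</root>\n"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(containsInAnyLine(new ByteArrayInputStream(sample), "4914904"));
    }
}
```

If the data might contain very long lines, reading fixed-size character chunks and keeping an overlap of needle.length() - 1 characters between chunks would be a safer variant of the same idea.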

On a side note: don't retype your code when you post here. Copy and paste it, so that typos introduced while retyping do not distract from the real problem.

Use -Xmx256m when starting Java; it raises the JVM's maximum heap size to 256 MB.

You might also want to consider using an XML parsing library instead of Scanner if you're dealing with XML. A streaming API like SAX or StAX is the best bet for a big input.
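As a rough illustration of the streaming suggestion, here is a minimal StAX sketch. The class name StaxSearch and the method textContains are hypothetical, and it assumes the value being searched for appears in element text (not in attribute values or tag names) and that the input is well-formed XML. Coalescing is switched on so that a text node is not split across several CHARACTERS events.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxSearch {

    // Hypothetical helper: returns true if any text node contains the search string.
    // The parser pulls events one at a time, so the document is never fully buffered.
    public static boolean textContains(InputStream in, String needle) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent character data so a match is not cut in half between events.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(in, "UTF-8");
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS
                        && reader.getText().contains(needle)) {
                    return true;
                }
            }
            return false;
        } finally {
            reader.close(); // releases parser resources; does not close the stream
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        // Small in-memory stand-in for the real xmlInputStream.
        byte[] sample = "<root><id>123</id><id>4914904</id></root>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(textContains(new ByteArrayInputStream(sample), "4914904"));
    }
}
```

Unlike a raw text search, this only looks at parsed text content, which is usually what you want when the input really is XML.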
