
java - analyzing big text files

I need to analyze a log file at runtime with Java.

What I need is to be able to take a big text file and search for a certain string or regex within a certain range of lines.

The range itself is determined by another search.

For example, I want to search for the string "operation ended with failure", but not in the whole file, only from the line that says "starting operation" onwards.

Of course I can do this with a plain InputStream and file reading, but is there a library or tool that would make it more convenient?

If the file is really huge, then well-written Java and any *nix tool will be almost equally slow, because the job is I/O-bound. You can't avoid reading the file line by line, and a few lines of Java code will do that. But rather than running a one-off search afterwards, I'd think about splitting the file at generation time, which might be much more efficient. You could pipe the log output into another program or script (either awk or Python would be perfect for it) and split it as it is generated rather than after the fact.
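For the "few lines of Java" part, here is a minimal sketch of one way it could look. The class name `LogRangeSearch` and the method `findAfterMarker` are made up for illustration, and the sketch assumes the range starts at the first line containing the marker and runs to the end of the file. It reads from an in-memory string so it runs as-is.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
public class LogRangeSearch {

    /**
     * Returns the 1-based numbers of the lines that match target, looking only
     * at lines from the first one containing startMarker to the end of input.
     * Lines are streamed one at a time, so the whole file is never held in memory.
     */
    static List<Integer> findAfterMarker(BufferedReader reader, String startMarker, Pattern target)
            throws IOException {
        List<Integer> hits = new ArrayList<>();
        boolean inRange = false;
        int lineNo = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            lineNo++;
            // Assumption: the range opens at the first marker line and never closes.
            if (!inRange && line.contains(startMarker)) {
                inRange = true;
            }
            if (inRange && target.matcher(line).find()) {
                hits.add(lineNo);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Small in-memory sample standing in for a real log file.
        String log = String.join("\n",
                "operation ended with failure",
                "starting operation",
                "step 1 ok",
                "operation ended with failure");
        try (BufferedReader reader = new BufferedReader(new StringReader(log))) {
            // Prints [4]: the failure on line 1 comes before the marker and is skipped.
            System.out.println(findAfterMarker(reader, "starting operation",
                    Pattern.compile("operation ended with failure")));
        }
    }
}
```

For a real log you would get the reader from `Files.newBufferedReader(path)` instead of a `StringReader`. If the range also needs an end marker, the same loop can switch `inRange` back off when it sees one.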

Check this one out - http://johannburkard.de/software/stringsearch/

Hope that helps ;)
