Any other parser instead of Jsoup?

Which small, lightweight parser would be better to use if, in my case, Jsoup.parse simply crashes because of the file size?

My code is not really important here, but this is it:

            Document doc = Jsoup.parse(html);

            // getElementsByTag returns an Elements collection, not a single Element
            Elements tables = doc.getElementsByTag("table");
            return tables;

OK, this actually works, but it behaves differently depending on whether I run the code on the PC (Dalvik virtual machine) or on an Android device (I am developing for Android). I am not sure what exactly the problem is, but it seems that the memory (heap size) is bigger on the device. I have not checked this so far; it is just an assumption.

As for Jsoup: I guess it is the fastest and smallest library suitable for my case, which is parsing and cleaning plain HTML on a DOM basis. If you need to extract some part of the HTML based on tags (tr, table, etc.), Jsoup is in my opinion the best open source HTML parser. Applying it takes only two lines of code, as shown in the example above. The elements you select can then be rendered as a simple String containing the tags you selected with Jsoup. I am sure it has more functionality than that; I have just never used anything more complex.
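The heap-size assumption above is easy to check. Here is a minimal sketch, assuming the limit reported by the standard `Runtime.getRuntime().maxMemory()` call is the relevant one in both environments; the class and method names (`HeapCheck`, `maxHeapMiB`) are made up for illustration only:

```java
public class HeapCheck {

    // Upper bound of the heap this VM will try to use, in MiB.
    // HeapCheck and maxHeapMiB are hypothetical names, not part of Jsoup or Android.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run the same call in both environments and compare the numbers.
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

If the value is clearly smaller in the environment where Jsoup.parse crashes, that would support the guess that the crash comes from running out of heap rather than from Jsoup itself.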

I'm guessing that you're trying to parse HTML; try Jericho.
