How to parse an HTML table's data by table id using jsoup in Java

I need to store my client's table data in a database.

The page contains any number of tables, and none of them has a class attribute; each table is identified only by its id.

Example:

<table width="100%" border="0" cellpadding="0" cellspacing="0" id="AutoNumber5" style="border-collapse: collapse" bordercolor="#111111">
<table width="100%" border="0" cellpadding="0" cellspacing="0" id="AutoNumber4" style="border-collapse: collapse" bordercolor="#111111">

If the tables had a class, I could parse them easily, but there is no class; only an id is given on each table.

I assume there is a short, one-line selector syntax for this, other than

for (Element table : doc.select("table"))

but I have not been able to find it. I have tried

for (Element table : doc.select("table.AutoNumber5"))

but it does not match anything.

How can I select a table by its id?

Try this:

doc.select("table#AutoNumber5");

It worked for me.

Reference: http://jsoup.org/apidocs/org/jsoup/select/Selector.html
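To make the selector above concrete, here is a minimal sketch of reading one table by its id and collecting its cell text row by row, which is roughly the shape needed before writing to a database. The class name `TableByIdExample`, the helper `readTable`, and the sample markup are made up for illustration; only the ids come from the question. It assumes the table contains no nested tables and that the id is a plain identifier with no characters that need escaping in a CSS selector.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class TableByIdExample {

    // Hypothetical helper: returns the cell text of the table with the given id,
    // one inner list per <tr>. Returns an empty list if no such table exists.
    static List<List<String>> readTable(Document doc, String tableId) {
        List<List<String>> rows = new ArrayList<>();
        // "table#<id>" matches a <table> element whose id attribute equals tableId.
        Element table = doc.select("table#" + tableId).first();
        if (table == null) {
            return rows;
        }
        // Assumes no nested tables; nested rows and cells would be included too.
        for (Element tr : table.select("tr")) {
            List<String> cells = new ArrayList<>();
            for (Element cell : tr.select("th, td")) {
                cells.add(cell.text());
            }
            rows.add(cells);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Sample markup invented for this sketch; a real page would come from
        // Jsoup.connect(url).get() instead.
        String html =
            "<table id=\"AutoNumber5\">"
          + "<tr><td>Name</td><td>Qty</td></tr>"
          + "<tr><td>Widget</td><td>3</td></tr>"
          + "</table>"
          + "<table id=\"AutoNumber4\"><tr><td>Other</td></tr></table>";
        Document doc = Jsoup.parse(html);
        System.out.println(readTable(doc, "AutoNumber5")); // [[Name, Qty], [Widget, 3]]
    }
}
```

Each inner list can then be mapped to one row of an insert statement; how the columns are named and typed depends on the actual tables.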

jsoup supports CSS selectors, so if you know CSS it is easy to use, like this:

Document doc = Jsoup.connect("http://xxxxxxxx.com/").get();

Elements el = doc.select("#targeted-element-id");

Just put your element's id directly after the # sign, with no space in between.
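As a rough illustration of why the attempt in the question matched nothing, the sketch below runs a few selectors against invented markup that reuses the two ids from the question. In CSS, `.` selects by class and `#` selects by id, so `table.AutoNumber5` looks for a class that does not exist. The class name `IdSelectorExample` and the helper `countMatches` are hypothetical; the prefix selector `[id^=...]` is one possible way to pick up all the numbered tables at once, assuming their ids share the `AutoNumber` prefix.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class IdSelectorExample {

    // Sample markup invented for this sketch; only the ids come from the question.
    static final String HTML =
        "<table id=\"AutoNumber5\"><tr><td>first</td></tr></table>"
      + "<table id=\"AutoNumber4\"><tr><td>second</td></tr></table>";

    // Hypothetical helper: number of elements in the sample markup matching the query.
    static int countMatches(String cssQuery) {
        Document doc = Jsoup.parse(HTML);
        return doc.select(cssQuery).size();
    }

    public static void main(String[] args) {
        System.out.println(countMatches("table.AutoNumber5"));     // 0: "." matches a class, not an id
        System.out.println(countMatches("table#AutoNumber5"));     // 1: tag name plus id
        System.out.println(countMatches("#AutoNumber5"));          // 1: id alone
        System.out.println(countMatches("table[id^=AutoNumber]")); // 2: every table whose id starts with the prefix

        // getElementById is an alternative to select() when a single element is wanted.
        Element byId = Jsoup.parse(HTML).getElementById("AutoNumber5");
        System.out.println(byId.text());                           // first
    }
}
```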
