
How to get data from td with multiple classes using jsoup

I'm trying to extract data from a web page that contains the markup below, but I'm stuck on a table cell (td) that has multiple classes: I can't get its contents.

<div class="Uia">
<div class="eXa Iqc">
<div class="wna fa-Lsa Ala">
<div class="Cr Aha">Contact info</div>
<div class="y4">
<table class="Mlb">
<tbody>
<tr>
<td class="MAa">Address</td>
<td class="QLa adr">
<div class="PHb">
<div>
1600 Amphitheatre Pkwy
Mountain View, CA 94043
United States
</div>
</div></td>
</tr>
<tr>
<td colspan="2"></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div>

I am trying to extract the address, which is in td class="QLa adr". Please help me.

System.out.println("ADDRESS  : " +doc.select("div.Uia > div.eXa.Iqc > div.wna.fa-Lsa.Ala > div.y4 > table[class=Mlb] > tbody > tr > td[class=QLa adr] > div").text());

You don't have to use such a complicated expression to get the classes; you can select them directly.

In addition, the [] syntax is for selecting attributes. As with normal CSS selectors, classes are selected by prefixing the name with a dot (.).
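To illustrate the difference between the two syntaxes, here is a minimal sketch. The class name `SelectorComparison`, the helper `count`, and the cut-down HTML string are assumptions made for this example; only the class names come from the question. It suggests that chained class selectors match regardless of the order of the classes, while a quoted attribute selector compares the whole attribute value as one string.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class SelectorComparison {
    // Cut-down version of the table from the question (assumed, for illustration).
    static final String HTML =
        "<table class=\"Mlb\"><tr>"
        + "<td class=\"MAa\">Address</td>"
        + "<td class=\"QLa adr\"><div>1600 Amphitheatre Pkwy</div></td>"
        + "</tr></table>";

    // Hypothetical helper: number of elements in HTML matched by the CSS query.
    static int count(String cssQuery) {
        Document doc = Jsoup.parse(HTML);
        return doc.select(cssQuery).size();
    }

    public static void main(String[] args) {
        // Class selectors: each .name tests for one class, in any order.
        System.out.println(count("td.QLa.adr"));            // 1
        System.out.println(count("td.adr.QLa"));            // 1
        // Attribute selector: compares the whole attribute value as a string.
        System.out.println(count("td[class=\"QLa adr\"]")); // 1
        System.out.println(count("td[class=\"adr QLa\"]")); // 0
    }
}
```

The attribute form only keeps working while the class attribute stays exactly "QLa adr", so the class form is usually the less brittle choice.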

In this case, assuming your page is loaded from a String (although you could obviously load it using connect), you can get your text with:

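// soup holds the page's HTML as a String
Document doc = Jsoup.parse(soup);
// .QLa.adr matches any element that carries both classes
Elements extractedClasses = doc.select(".QLa.adr");

System.out.println(extractedClasses.text());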

This prints out:

1600 Amphitheatre Pkwy Mountain View, CA 94043 United States
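If other elements on the page might also carry these classes, the selector can be scoped to the contact table. The following is a sketch under stated assumptions: the class `AddressExtractor` and the method `extractAddress` are hypothetical names, the HTML is a shortened copy of the question's markup, and it assumes a jsoup version that provides `selectFirst`.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class AddressExtractor {
    // Hypothetical helper: returns the address text, or "" when no cell matches.
    static String extractAddress(String html) {
        Document doc = Jsoup.parse(html);
        // Limit the class selector to cells inside the table with class Mlb.
        Element cell = doc.selectFirst("table.Mlb td.QLa.adr");
        return cell == null ? "" : cell.text();
    }

    public static void main(String[] args) {
        // Shortened copy of the markup from the question (assumed, for illustration).
        String html = "<table class=\"Mlb\"><tr>"
            + "<td class=\"MAa\">Address</td>"
            + "<td class=\"QLa adr\"><div><div>1600 Amphitheatre Pkwy\n"
            + "Mountain View, CA 94043\nUnited States</div></div></td>"
            + "</tr></table>";
        // text() collapses the line breaks into single spaces.
        System.out.println(extractAddress(html));
    }
}
```

Checking for null avoids an exception when the page layout changes and the cell is no longer present.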

I was able to get it:

System.out.println("ADDRESS  : " +doc.select("div.Uia > div.eXa.Iqc > div.wna.fa-Lsa.Ala > div.y4 > table[class=Mlb] > tbody > tr > td[class=QLa adr] > div").text());

System.out.println("ADDRESS  : " +doc.select("div.Uia > div.eXa.Iqc > div.wna.fa-Lsa.Ala > div.y4 > table[class=Mlb] > tbody > tr > td.QLa.adr > div").text());
