Specifying what data to scrape - Jsoup + Android Studio
I am using Jsoup to scrape data and display it on my phone, in an app built with Android Studio. I have code that scrapes all the <td> tags; however, I am not trying to scrape them all, just certain ones in a certain order.
</tr>
</table>
</td>
</tr><tr>
<td>
<table cellspacing='0' border='0' width='100%' >
<col align='left' /><col align='center' /><col align='right' />
<tr>
<td></td><td></td><td></td>
Also, when the result is displayed on my phone, the <td> tags themselves are shown, and I don't want them to be. I don't want to scrape any of the <td> tags from the HTML above.
<td bgcolor='#C0C0C0' colspan='1'><font color='#FFFFFF'>9:00</font></td>
<td bgcolor='#C0C0C0' colspan='1'><font color='#FFFFFF'>9:15</font></td>
<td bgcolor='#C0C0C0' colspan='1'><font color='#FFFFFF'>9:30</font></td>
<td bgcolor='#C0C0C0' colspan='1'><font color='#FFFFFF'>9:45</font></td>
Above and below is the HTML I want to scrape.
<tr >
<td style="border-bottom:3px solid #000000;" rowspan='1' bgcolor='#C0C0C0'><font color='#FFFFFF'>Mon</font></td>
<td style="border-bottom:3px solid #000000;" colspan='12' rowspan='1' >
<table cellspacing='0' border='0' width='100%'>
<col align='left' />
<tr>
<td align='left'><font color='#FF0000'>Sounds</font></td>
</tr>
</table>
<table cellspacing='0' border='0' width='100%'>
<col align='left' />
<col align='right' />
<tr>
<td align='left'><font color='#000000'>P0000</font></td>
<td align='right'><font color='#008000'>P.Man</font></td>
</tr>
</table>
What I want it to display is "Mon", then "9:00", then "Sounds", then "P0000", and then "P.Man".
This is the code I have at the moment. Does anyone have any clues?
Read the documentation.
Elements tableElements = doc.select("td");
for (Element td : tableElements) {
buffer.append("TT [" + td + "] \r\n");
Log.d("JSwA", "TT [" + td + "]");
}
}
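A likely reason the <td> markup shows up on screen: concatenating an Element into a string calls its toString(), which in Jsoup returns the element's outer HTML. A minimal sketch of the difference, assuming the Jsoup library is on the classpath; the class name TdTextDemo, the method cellTexts, and the constant SAMPLE_HTML are hypothetical names made up for this illustration, and the sample cells are copied from the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class TdTextDemo {

    // Two cells copied from the question, wrapped in <table><tr> so that
    // the HTML parser treats them as table cells.
    static final String SAMPLE_HTML = "<table><tr>"
            + "<td bgcolor='#C0C0C0' colspan='1'><font color='#FFFFFF'>9:00</font></td>"
            + "<td bgcolor='#C0C0C0' colspan='1'><font color='#FFFFFF'>9:15</font></td>"
            + "</tr></table>";

    // Returns the visible text of each <td>, without the surrounding markup.
    static List<String> cellTexts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        for (Element td : doc.select("td")) {
            // td.text() is the text content only; "" + td would be the outer HTML.
            texts.add(td.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        for (String text : cellTexts(SAMPLE_HTML)) {
            System.out.println("TT [" + text + "]");
        }
    }
}
```

On the full page, a cell that wraps nested tables would return the combined text of everything inside it, so a narrower selector than "td" is probably still needed.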
Try this CSS selector:
#post-15 > div > table:nth-child(6) > tbody > tr:nth-child(2) > td:nth-child(2) > table:not(:last-of-type)
String text = doc.select("#post-15 > div > table:nth-child(6) > tbody > tr:nth-child(2) > td:nth-child(2) > table:not(:last-of-type)").text();
// text should contain "Sounds P0000 P.Man"
The line of code above tells Jsoup to find all the tables containing the desired text, except the last one.
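The selector above depends on the full page layout (#post-15 and the nth-child positions), which is not shown in the question. As an alternative sketch only, each value could be picked by the attributes visible in the posted HTML and then assembled in the requested order. This assumes the Jsoup library is on the classpath and that the bgcolor, rowspan, colspan, and font color attributes identify the cells reliably, which may not hold on the real page. The names OrderedScrapeDemo, firstText, orderedValues, and SAMPLE_HTML are hypothetical, and the sample is a reduced version of the posted markup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

import java.util.Arrays;
import java.util.List;

public class OrderedScrapeDemo {

    // Reduced sample built from the fragments in the question (an assumption
    // about how they fit together on the real page).
    static final String SAMPLE_HTML = "<table>"
            + "<tr>"
            + "<td bgcolor='#C0C0C0' colspan='1'><font color='#FFFFFF'>9:00</font></td>"
            + "<td bgcolor='#C0C0C0' colspan='1'><font color='#FFFFFF'>9:15</font></td>"
            + "</tr>"
            + "<tr>"
            + "<td rowspan='1' bgcolor='#C0C0C0'><font color='#FFFFFF'>Mon</font></td>"
            + "<td colspan='12' rowspan='1'>"
            + "<table><tr><td align='left'><font color='#FF0000'>Sounds</font></td></tr></table>"
            + "<table><tr>"
            + "<td align='left'><font color='#000000'>P0000</font></td>"
            + "<td align='right'><font color='#008000'>P.Man</font></td>"
            + "</tr></table>"
            + "</td>"
            + "</tr>"
            + "</table>";

    // Text of the first element matching the selector, or "" if nothing matches.
    static String firstText(Document doc, String css) {
        Elements found = doc.select(css);
        return found.isEmpty() ? "" : found.first().text();
    }

    // Collects the values in the order asked for in the question.
    static List<String> orderedValues(String html) {
        Document doc = Jsoup.parse(html);
        return Arrays.asList(
                firstText(doc, "td[rowspan][bgcolor=#C0C0C0] > font"),   // day cell
                firstText(doc, "td[colspan=1][bgcolor=#C0C0C0] > font"), // first time cell
                firstText(doc, "font[color=#FF0000]"),                   // red text
                firstText(doc, "font[color=#000000]"),                   // black text
                firstText(doc, "font[color=#008000]"));                  // green text
    }

    public static void main(String[] args) {
        for (String value : orderedValues(SAMPLE_HTML)) {
            System.out.println(value);
        }
    }
}
```

Selecting by attribute avoids depending on element positions, but it ties the code to the page's styling; if the colors change, the selectors would need to change with them.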