How to parse an HTML file using Jsoup with multiple class-name elements?
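The Java code below works for an HTML file that contains a class such as css-sched-table-title.
But the HTML file has several class names I need to find, for example css-sched-waypoints and css-sched-times. How can I combine them in a single search with jsoup's getElementsByClass method? I don't want to run the code once per class, because I want to preserve the order of the elements. I mean I want something like
doc.getElementsByClass("css-sched-table-title") || doc.getElementsByClass("css-sched-waypoints");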
// content holds the HTML source as a String
Document doc = Jsoup.parse(content);
// Matches only elements carrying this one class
Elements ele = doc.getElementsByClass("css-sched-table-title");
for (Element link : ele) {
    String linkText = link.text();
    System.out.println(linkText);
}
The HTML looks like this:
<tr ALIGN="CENTER">
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">6:15</td>
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">6:20</td>
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">6:24</td>
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">6:34</td>
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">6:34</td>
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">6:40</td>
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">6:46</td>
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">6:54</td>
</tr>
<tr VALIGN="BOTTOM">
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">Townline and Southern</TD>
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">Clearbrook and Blueridge</TD>
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">Clearbrook and South Fraser</TD>
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">Ar. Bourquin Exchange</TD>
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">Lv. Bourquin Exchange</TD>
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">Downtown Abbotsford</TD>
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">McMillan and Old Yale</TD>
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">Sandy Hill and Old Clayburn</TD>
</tr>
<tr ALIGN="CENTER">
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">8:12</td>
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">8:17</td>
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">8:21</td>
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">8:31</td>
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">8:34</td>
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">8:40</td>
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">8:46</td>
<td CLASS="css-sched-times"> </td>
<td CLASS="css-sched-times">8:54</td>
</tr>
Taking a cue from your earlier question, I combined the three td selectors using valid Selector syntax and got the result you expect.
doc.select("td[class=css-sched-table-title], td[class=css-sched-waypoints], td[class=css-sched-times]")
Note that you can combine multiple conditions in the selector syntax by separating them with a comma, which effectively becomes your OR operator:
// The comma-separated group matches a td with any one of the three class values
Elements row = doc.select("td[class=css-sched-table-title], td[class=css-sched-waypoints], td[class=css-sched-times]");
System.out.println("::Total Count::" + row.size());
Iterator<Element> iterator = row.listIterator();
while (iterator.hasNext()) {
    Element element = iterator.next();
    String id = element.attr("id");        // empty string when the td has no id
    String classes = element.attr("class");
    String value = element.text();
    System.out.println("Id : " + id + ", classes : " + classes
            + ", value : " + value);
}
which gives:
::Total Count::25
Id : , classes : css-sched-table-title, value : Saturday - Afternoon
Id : , classes : css-sched-waypoints, value : Townline and Southern
Id : , classes : css-sched-waypoints, value : Clearbrook and Blueridge
Id : , classes : css-sched-waypoints, value : Clearbrook and South Fraser
Id : , classes : css-sched-waypoints, value : Ar. Bourquin Exchange
Id : , classes : css-sched-waypoints, value : Lv. Bourquin Exchange
Id : , classes : css-sched-waypoints, value : Downtown Abbotsford
Id : , classes : css-sched-waypoints, value : McMillan and Old Yale
Id : , classes : css-sched-waypoints, value : Sandy Hill and Old Clayburn
Id : , classes : css-sched-times, value :
Id : , classes : css-sched-times, value : 6:15
Id : , classes : css-sched-times, value :
Id : , classes : css-sched-times, value : 6:20
Id : , classes : css-sched-times, value :
Id : , classes : css-sched-times, value : 6:24
Id : , classes : css-sched-times, value :
Id : , classes : css-sched-times, value : 6:34
Id : , classes : css-sched-times, value :
Id : , classes : css-sched-times, value : 6:34
Id : , classes : css-sched-times, value :
Id : , classes : css-sched-times, value : 6:40
Id : , classes : css-sched-times, value :
Id : , classes : css-sched-times, value : 6:46
Id : , classes : css-sched-times, value :
Id : , classes : css-sched-times, value : 6:54
For details on the Selector syntax, see the jsoup Selector documentation.
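As a complement to the attribute selectors above, here is a minimal sketch that uses class selectors (td.name) instead of td[class=name]. The difference is that [class=...] compares the whole attribute value, so it would miss a cell written as class="css-sched-times extra", while td.css-sched-times matches any td whose class list contains that name. The class MultiClassSelect, the helper collect, the extra class and the inline HTML are all made up for illustration, and the sketch assumes jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original question or answer.
class MultiClassSelect {

    // Returns "classes: text" for every td whose class list contains
    // any of the three schedule class names.
    static List<String> collect(String html) {
        Document doc = Jsoup.parse(html);
        // The comma groups three class selectors; a td matching any one is selected.
        Elements cells = doc.select(
                "td.css-sched-table-title, td.css-sched-waypoints, td.css-sched-times");
        List<String> out = new ArrayList<>();
        for (Element cell : cells) {
            out.add(cell.className() + ": " + cell.text());
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample; the last cell carries a second class on purpose.
        String html = "<table>"
                + "<tr><td class=\"css-sched-table-title\">Saturday - Afternoon</td></tr>"
                + "<tr><td class=\"css-sched-waypoints\">Townline and Southern</td></tr>"
                + "<tr><td class=\"css-sched-times extra\">6:15</td></tr>"
                + "</table>";
        for (String line : collect(html)) {
            System.out.println(line);
        }
    }
}
```

On the ordering concern from the question: in the jsoup releases I am familiar with, a grouped selector is evaluated in one pass over the document, so the matches should come back in document order rather than grouped by selector. It is worth confirming this against the jsoup version you actually use.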