jsoup remove div with a certain class
I have a list in jsoup like this:
Elements tbody = new Elements();
tbody might look like this (---- separates the elements in the tbody list):
<td>
<div data-emission="56b2140adb6da7bf3cbf6228" class="mainCell">
<a href="/tv/weather-country-12457/"> <span class="left">16:00</span>
<div>
<p>Weather - country</p>
</div> </a>
</div>
<div data-emission="56b2140adb6da7bf3cbf6237" class="mainCell shows pending">
<a href="/shows/that's-70-show-550347/epi1201/"> <span class="left">16:10</span>
<div>
<p>That's 70 show</p>
<span class="info">epi. 1201, Show</span>
</div> <p class="onAir"> <span>Pending</span> <u></u> <u style="width: 5%"></u> </p> </a>
</div> </td>
---------------------------------------------------------------------------
<td>
<div data-emission="56b23876db6da7bf3cbf6588" class="mainCell pending">
<a href="/tv/weather-563806/"> <span class="left">16:10</span>
<div>
<p>Weather</p>
</div> <p class="onAir"> <span>Pending</span> <u></u> <u style="width: 51%"></u> </p> </a>
</div>
<div data-emission="56b23876db6da7bf3cbf6589" class="mainCell">
<a href="/tv/animal-cops-2615/"> <span class="left">16:15</span>
<div>
<p>Animal Cops</p>
<span class="info">epi. 3079, Show</span>
</div> </a>
</div>
<div data-emission="56b23876db6da7bf3cbf658a" class="mainCell shows">
<a href="/show/house-md-1601/odc137/"> <span class="left">16:30</span>
<div>
<p>House MD</p>
<span class="info">epi. 137, Show</span>
</div> </a>
</div> </td>
---------------------------------------------------------------------------
<td>
<div data-emission="56b213b3db6da7bf3cbf61a1" class="mainCell movies pending">
<a href="/movie/star-trek-564170/"> <span class="left">16:00</span>
<div>
<p>Star Trek</p>
<span class="info">Movie</span>
<span class="szh prem">| Premiere</span>
</div> <p class="onAir"> <span>Pending</span> <u></u> <u style="width: 21%"></u> </p> </a>
</div> </td>
My goal is to remove every movie/show that is pending/onAir. So in this example I would like to get rid of each whole div that contains:
that's 70 show
weather
star trek
For example:
for (int i = 0; i < tbody.size(); i++) {
    tbody.get(i).select("div").select("p").select(".onAir").remove();
}
It removes only the p element itself, not the whole div. I have tried many ways, but without success. I would appreciate any help.
It seems that the pending shows also carry the pending CSS class. If this is true in all cases, you can do it very simply with:
doc.select("td>div.pending").remove();
This will remove all div elements with the pending class from the document doc, provided they are direct children of a td element.
Alternatively, you can keep your approach and filter for the p element with the onAir class and the matching inner text:
doc.select("td>div:has(p.onAir:contains(Pending))").remove();
See the jsoup CSS selector syntax documentation to understand the power of jsoup.
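To try both selectors side by side, here is a minimal sketch. The class name RemovePendingDemo, the HTML constant, and the two helper methods are made up for illustration, and the markup is a shortened version of the sample from the question. It assumes jsoup is on the classpath. The td is wrapped in a table because jsoup's HTML parser does not keep a bare td outside table context.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class RemovePendingDemo {
    // Shortened sample markup (hypothetical): one normal cell and one pending cell.
    static final String HTML =
          "<table><tr><td>"
        + "<div class=\"mainCell\"><a href=\"/a\"><div><p>Weather - country</p></div></a></div>"
        + "<div class=\"mainCell shows pending\"><a href=\"/b\"><div><p>That's 70 show</p></div>"
        + "<p class=\"onAir\"><span>Pending</span></p></a></div>"
        + "</td></tr></table>";

    // Variant 1: match on the "pending" class of the outer div.
    static int remainingAfterClassSelector() {
        Document doc = Jsoup.parse(HTML);
        doc.select("td>div.pending").remove();
        return doc.select("div.mainCell").size();
    }

    // Variant 2: match any direct child div of td that contains p.onAir with the text "Pending".
    static int remainingAfterHasSelector() {
        Document doc = Jsoup.parse(HTML);
        doc.select("td>div:has(p.onAir:contains(Pending))").remove();
        return doc.select("div.mainCell").size();
    }

    public static void main(String[] args) {
        System.out.println(remainingAfterClassSelector());
        System.out.println(remainingAfterHasSelector());
    }
}
```

In both variants the selector matches the outer div, so remove() drops the whole cell rather than only the p element inside it.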
Try the following code snippet.
Elements mainCells = tbody.select("div.mainCell");
for (int i = 0; i < mainCells.size(); i++) {
    Elements mainCellsP = mainCells.get(i).select("div").select("a").select("p");
    // In the sample markup a pending cell has two <p> elements inside its <a>:
    // the title and the p.onAir marker.
    if (mainCellsP.size() == 2) {
        // Remove this node from the DOM tree
        mainCells.get(i).remove();
    }
}
First select the node you want to delete, then call the remove() method on that node.
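A variation on the same idea, closer to the loop from the question: find each p.onAir marker, walk up to its enclosing div.mainCell, and remove that div instead of the marker. This is only a sketch; the class RemoveEnclosingDivDemo, the HTML constant, and the method remainingCells are hypothetical names, and jsoup is assumed to be on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class RemoveEnclosingDivDemo {
    // Shortened sample markup (hypothetical), wrapped in a table so the td survives parsing.
    static final String HTML =
          "<table><tr><td>"
        + "<div class=\"mainCell\"><a href=\"/a\"><div><p>Weather - country</p></div></a></div>"
        + "<div class=\"mainCell movies pending\"><a href=\"/c\"><div><p>Star Trek</p></div>"
        + "<p class=\"onAir\"><span>Pending</span></p></a></div>"
        + "</td></tr></table>";

    static int remainingCells(String html) {
        Document doc = Jsoup.parse(html);
        Elements tbody = doc.select("td");            // stands in for the list from the question
        for (Element marker : tbody.select("p.onAir")) {
            // Climb from the marker to the enclosing div.mainCell.
            Element node = marker;
            while (node != null && !node.hasClass("mainCell")) {
                node = node.parent();
            }
            if (node != null) {
                node.remove();                        // removes the whole cell, not just the <p>
            }
        }
        return doc.select("div.mainCell").size();
    }

    public static void main(String[] args) {
        System.out.println(remainingCells(HTML));
    }
}
```

Removing nodes while iterating is safe here because select() returns its own list of matches; detaching an element from the document does not change that list.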