
jsoup remove div with a certain class

I have a list in jsoup like this:

Elements tbody = new Elements();

tbody might look like this (---- separates the elements in the tbody list):

<td> 
 <div data-emission="56b2140adb6da7bf3cbf6228" class="mainCell"> 
  <a href="/tv/weather-country-12457/"> <span class="left">16:00</span> 
   <div> 
    <p>Weather - country</p> 
   </div> </a> 
 </div> 
 <div data-emission="56b2140adb6da7bf3cbf6237" class="mainCell shows pending"> 
  <a href="/shows/that's-70-show-550347/epi1201/"> <span class="left">16:10</span> 
   <div> 
    <p>That's 70 show</p> 
    <span class="info">epi. 1201, Show</span> 
   </div> <p class="onAir"> <span>Pending</span> <u></u> <u style="width: 5%"></u> </p> </a> 
 </div> </td>
 ---------------------------------------------------------------------------
 <td> 
 <div data-emission="56b23876db6da7bf3cbf6588" class="mainCell pending"> 
  <a href="/tv/weather-563806/"> <span class="left">16:10</span> 
   <div> 
    <p>Weather</p> 
   </div> <p class="onAir"> <span>Pending</span> <u></u> <u style="width: 51%"></u> </p> </a> 
 </div> 
 <div data-emission="56b23876db6da7bf3cbf6589" class="mainCell"> 
  <a href="/tv/animal-cops-2615/"> <span class="left">16:15</span> 
   <div> 
    <p>Animal Cops</p> 
    <span class="info">epi. 3079, Show</span> 
   </div> </a> 
 </div> 
 <div data-emission="56b23876db6da7bf3cbf658a" class="mainCell shows"> 
  <a href="/show/house-md-1601/odc137/"> <span class="left">16:30</span> 
   <div> 
    <p>House MD</p> 
    <span class="info">epi. 137, Show</span> 
   </div> </a> 
 </div> </td>
 ---------------------------------------------------------------------------
 <td> 
 <div data-emission="56b213b3db6da7bf3cbf61a1" class="mainCell movies pending"> 
  <a href="/movie/star-trek-564170/"> <span class="left">16:00</span> 
   <div> 
    <p>Star Trek</p> 
    <span class="info">Movie</span> 
    <span class="szh prem">| Premiere</span> 
   </div> <p class="onAir"> <span>Pending</span> <u></u> <u style="width: 21%"></u> </p> </a> 
 </div> </td>

My goal is to remove every movie/show that is pending/onAir. So in this example I would like to get rid of the whole div for each of:

  • that's 70 show
  • weather
  • star trek

For example, I tried:

for (int i = 0; i < tbody.size(); i++) {
    tbody.get(i).select("div").select("p").select(".onAir").remove();
}

This removes only the p.onAir element itself, not the whole div. I have tried many approaches without success. I would appreciate any help.

It seems that the pending shows also carry the pending CSS class. If this is true in all cases, you can do it very simply:

doc.select("td>div.pending").remove();

This removes every div element with the pending class from the document doc, provided it is a direct child of a td element.
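To see the effect end to end, here is a minimal sketch, assuming jsoup is on the classpath. The class name RemovePendingDivs, the helper removePending, and the shortened HTML are made up for illustration and are not part of the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class RemovePendingDivs {

    // Hypothetical helper: detaches every div.pending that is a direct
    // child of a td and returns how many elements were matched.
    static int removePending(Document doc) {
        Elements pending = doc.select("td > div.pending");
        pending.remove(); // removes each matched element from the DOM
        return pending.size();
    }

    public static void main(String[] args) {
        // Shortened stand-in for the markup in the question.
        String html = "<table><tr><td>"
                + "<div class=\"mainCell\"><p>Weather - country</p></div>"
                + "<div class=\"mainCell shows pending\"><p>That's 70 show</p></div>"
                + "</td></tr></table>";

        Document doc = Jsoup.parse(html);
        int removed = removePending(doc);

        System.out.println("removed: " + removed);
        System.out.println(doc.select("td").html());
    }
}
```

Because the selector targets the outer div rather than the inner p, the whole cell disappears, including its link and title.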

Alternatively, you can keep your approach and select the divs that contain a p element with the onAir class and the inner text Pending:

doc.select("td>div:has(p.onAir:contains(Pending))").remove();

See the CSS selector syntax documentation to understand the power of jsoup.
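The same :has selector should also work directly on an Elements list such as the tbody list from the question, since Elements.select searches inside each element of the list. The following is a sketch under that assumption; the class RemoveOnAirCells, the helper removeOnAir, and the sample HTML are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.select.Elements;

public class RemoveOnAirCells {

    // Hypothetical helper: removes every direct-child div of a td that
    // contains a p.onAir whose text includes "Pending".
    // Returns the number of divs that were matched and detached.
    static int removeOnAir(Elements cells) {
        Elements onAir = cells.select("td > div:has(p.onAir:contains(Pending))");
        onAir.remove();
        return onAir.size();
    }

    public static void main(String[] args) {
        // Shortened stand-in for the markup in the question.
        String html = "<table><tr><td>"
                + "<div class=\"mainCell movies pending\"><a href=\"#\">"
                + "<div><p>Star Trek</p></div>"
                + "<p class=\"onAir\"><span>Pending</span></p></a></div>"
                + "<div class=\"mainCell\"><a href=\"#\">"
                + "<div><p>Animal Cops</p></div></a></div>"
                + "</td></tr></table>";

        // Plays the role of the tbody list from the question.
        Elements tbody = Jsoup.parse(html).select("td");

        System.out.println("removed: " + removeOnAir(tbody));
        System.out.println(tbody.outerHtml());
    }
}
```

This variant does not rely on the pending class being present on the outer div; it only looks at the p.onAir marker inside it.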

Try the following code snippet.

Elements mainCells = tbody.select("div.mainCell");
for(int i = 0; i < mainCells.size(); i++){
    Elements mainCellsP = mainCells.get(i).select("div").select("a").select("p");
    if (mainCellsP.size() == 2) {
        // Remove this node from DOM tree
        mainCells.get(i).remove();
    }
}

First select the node you want to delete, then call the remove() method of that node.
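Counting p elements works for the sample markup but would break if a cell ever had a different number of paragraphs. A possibly more robust variant of the same select-then-remove loop checks for the p.onAir marker explicitly. This is only a sketch; the class RemoveByLoop and the helper removeOnAirCells are invented names, and the HTML is a shortened stand-in.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

public class RemoveByLoop {

    // Hypothetical helper: walks every div.mainCell, removes the ones that
    // contain a p.onAir, and returns their data-emission ids.
    static List<String> removeOnAirCells(Elements tbody) {
        List<String> removedIds = new ArrayList<>();
        // select() returns a separate list, so detaching elements from the
        // DOM inside the loop does not disturb the iteration.
        for (Element cell : tbody.select("div.mainCell")) {
            if (!cell.select("p.onAir").isEmpty()) {
                removedIds.add(cell.attr("data-emission"));
                cell.remove();
            }
        }
        return removedIds;
    }

    public static void main(String[] args) {
        String html = "<table><tr><td>"
                + "<div data-emission=\"a1\" class=\"mainCell pending\"><a href=\"#\">"
                + "<div><p>Weather</p></div>"
                + "<p class=\"onAir\"><span>Pending</span></p></a></div>"
                + "<div data-emission=\"b2\" class=\"mainCell shows\"><a href=\"#\">"
                + "<div><p>House MD</p></div></a></div>"
                + "</td></tr></table>";

        Elements tbody = Jsoup.parse(html).select("td");

        System.out.println("removed ids: " + removeOnAirCells(tbody));
        System.out.println(tbody.outerHtml());
    }
}
```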
