How can I remove this one in BeautifulSoup (Python)?
I want to use BeautifulSoup to keep only the SAVE text in the HTML below and discard the DEL text. How can I do that?
<div>
<span class="chk_box">
<input id="subj5" name="subj" onclick="subjSel();" type="checkbox" value="5"/>
<label for="subj5">
SAVE1
</label>
</span>
<span class="chk_box">
<input id="subj6" name="subj" onclick="subjSel();" type="checkbox" value="6"/>
<label for="subj6">
SAVE2
</label>
</span>
<span class="chk_box">
<input disabled="" id="subj7" name="subj" onclick="subjSel();" type="checkbox" value="7"/>
<label for="" subj7""="">
DEL1
</label>
</span>
<span class="chk_box">
<input disabled="" id="subj8" name="subj" onclick="subjSel();" type="checkbox" value="8"/>
<label for="subj78">
DEL2
</label>
</span>
</div>
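It looks like the distinguishing feature is that the input elements of the items you want to keep have no disabled attribute, while the DEL items carry disabled="".
So you can filter on that: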
from bs4 import BeautifulSoup
html = '''<div>
<span class="chk_box">
<input id="subj5" name="subj" onclick="subjSel();" type="checkbox" value="5"/>
<label for="subj5">
SAVE1
</label>
</span>
<span class="chk_box">
<input id="subj6" name="subj" onclick="subjSel();" type="checkbox" value="6"/>
<label for="subj6">
SAVE2
</label>
</span>
<span class="chk_box">
<input disabled="" id="subj7" name="subj" onclick="subjSel();" type="checkbox" value="7"/>
<label for="" subj7""="">
DEL1
</label>
</span>
<span class="chk_box">
<input disabled="" id="subj8" name="subj" onclick="subjSel();" type="checkbox" value="8"/>
<label for="subj78">
DEL2
</label>
</span>
</div>'''
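# Name the parser explicitly; without it bs4 warns and uses whichever parser is installed.
soup = BeautifulSoup(html, 'html.parser')
# A value of None matches only <input> tags that have no disabled attribute;
# the label text is read from each input's next sibling element.
results = [i.find_next_sibling().get_text().strip() for i in soup.find_all('input', {'disabled': None})]
print(results)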
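As a variation on the same idea, here is a minimal sketch that checks for the attribute with Tag.has_attr instead of passing None as the filter value, and that can also delete the disabled entries from the tree. The helper names enabled_labels and drop_disabled are made up for illustration, and the markup is a shortened, tidied copy of the HTML from the question, so treat it as one possible approach rather than the answer's own method.

```python
from bs4 import BeautifulSoup

# Shortened, tidied copy of the question's markup (assumption: same structure).
html = '''<div>
<span class="chk_box">
<input id="subj5" name="subj" type="checkbox" value="5"/>
<label for="subj5">
SAVE1
</label>
</span>
<span class="chk_box">
<input disabled="" id="subj7" name="subj" type="checkbox" value="7"/>
<label for="subj7">
DEL1
</label>
</span>
</div>'''

soup = BeautifulSoup(html, 'html.parser')


def enabled_labels(soup):
    """Return the label text for every <input> that has no disabled attribute."""
    texts = []
    for box in soup.find_all('input'):
        if box.has_attr('disabled'):
            continue  # skip the DEL entries
        label = box.find_next_sibling('label')
        if label is not None:
            texts.append(label.get_text(strip=True))
    return texts


def drop_disabled(soup):
    """Remove each span.chk_box whose <input> is disabled, modifying soup in place."""
    for span in soup.find_all('span', class_='chk_box'):
        box = span.find('input')
        if box is not None and box.has_attr('disabled'):
            span.decompose()
    return soup


print(enabled_labels(soup))
```

Use enabled_labels when only the text is needed; call drop_disabled when the goal is to keep working with the HTML after the disabled entries are gone.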