CSS children selector (not being able to select all children)
This is the image of what I'm trying to scrape using Beautiful Soup. Whenever I use the code shown below, I only get access to the first child. I am never able to get access to all the children. Can someone help me with this?
item = soup.select("ul.items > li")
print(len(item))
The problem can be fixed in 2 steps as follows:
Working solution:
# File name: soup-demo.py
inputHTML = """
<ul class="items">
<li class="class1">item 1</li>
<li class="class1">item 3</li>
<li class="class1">item 3</li>
</ul>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(inputHTML, 'html.parser')
# select_one() expects a CSS selector, so the class goes in the selector itself
itemList = soup.select_one("ul.items")
items = itemList.find_all("li")
print("Found", len(items), "items")
for item in items:
    print(item)
Output:
$ python3 soup-demo.py
Found 3 items
<li class="class1">item 1</li>
<li class="class1">item 3</li>
<li class="class1">item 3</li>
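One possible cause of seeing only the first child, which the question does not confirm, is calling select_one() where select() was intended. The sketch below uses made-up sample HTML to show the difference: select() returns a list of every match, while select_one() returns only the first matching element.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not the page from the question
html = """
<ul class="items">
  <li>1</li>
  <li>2</li>
  <li>3</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# select() returns a list with every direct <li> child of ul.items
all_items = soup.select("ul.items > li")

# select_one() returns only the first match (or None if nothing matches)
first_item = soup.select_one("ul.items > li")

print(len(all_items))
print(first_item.get_text())
```

If select() still returns a single element on the real page, the markup that Beautiful Soup receives may differ from what the browser shows, so printing the parsed ul element is a reasonable next check.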
Maybe your version is wrong. This works fine:
from bs4 import BeautifulSoup
html = '''
<ul class="items">
<li>1</li>
<li>2</li>
</ul>
'''
soup = BeautifulSoup(html, features="lxml")
item = soup.select('ul.items>li')
print(len(item))
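To check the version suggestion, a minimal sketch like the following prints the installed Beautiful Soup version and runs the same selector. It assumes the built-in html.parser instead of lxml so that no extra package is needed; the sample HTML is the same as above.

```python
import bs4
from bs4 import BeautifulSoup

html = '''
<ul class="items">
<li>1</li>
<li>2</li>
</ul>
'''

# bs4.__version__ holds the installed Beautiful Soup version string
print("bs4 version:", bs4.__version__)

# Same selector as above, parsed with the standard-library parser
soup = BeautifulSoup(html, "html.parser")
items = soup.select("ul.items>li")
print(len(items))
```

If the count here is 2 but the real page still yields one item, the version is probably not the cause and the input HTML is worth inspecting instead.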
Here's another solution:
from simplified_scrapy.simplified_doc import SimplifiedDoc
html = '''
<ul class="items">
<li>1</li>
<li>2</li>
</ul>
'''
doc = SimplifiedDoc(html)
item = doc.selects('ul.items>li')
print(len(item))