CSS child selector (unable to select all children)

Question

The image shows what I'm trying to scrape using Beautiful Soup. Whenever I use the code shown below, I only get access to the first child, never all of the children. Can someone help me with this?

item = soup.select("ul.items > li")
print(len(item))

Answer 1

The problem can be fixed in two steps:

1. Use select_one on soup to get the ul.
2. Use find_all on the ul to fetch all the li items.

Working solution:

# File name: soup-demo.py

inputHTML = """
<ul class="items">
<li class="class1">item 1</li>
<li class="class1">item 3</li>
<li class="class1">item 3</li>
</ul>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(inputHTML, 'html.parser')
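# select_one takes a CSS selector string, so the class goes in the selector
# (class_ is a keyword for find()/find_all(), not for select_one()).
itemList = soup.select_one("ul.items")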

items = itemList.find_all("li")
print("Found ", len(items), " items")
for item in items:
    print(item)

Output:

$ python3 soup-demo.py 
Found  3  items
<li class="class1">item 1</li>
<li class="class1">item 3</li>
<li class="class1">item 3</li>
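One thing to keep in mind with this approach: find_all("li") returns every descendant li, while the selector from the question ("ul.items > li") matches direct children only. The two give the same result for a flat list like the one above, but not for nested lists. A minimal sketch of the difference, using made-up nested markup (the HTML here is hypothetical, not taken from the question):

```python
from bs4 import BeautifulSoup

# Hypothetical markup with a nested list, made up for illustration.
html = """
<ul class="items">
  <li>a</li>
  <li>b
    <ul>
      <li>b1</li>
    </ul>
  </li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")
ul = soup.select_one("ul.items")

direct_css = soup.select("ul.items > li")         # direct children only
all_desc = ul.find_all("li")                      # every descendant <li>
direct_find = ul.find_all("li", recursive=False)  # direct children only

print(len(direct_css), len(all_desc), len(direct_find))
```

If only the top-level items are wanted, passing recursive=False to find_all keeps it equivalent to the child selector.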

Answer 2

Maybe the version you have installed is the problem. The following works for me:

from bs4 import BeautifulSoup
html = '''
<ul class="items">
  <li>1</li>
  <li>2</li>
</ul>
'''
soup = BeautifulSoup(html, features="lxml")
item = soup.select('ul.items > li')
print(len(item))
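To rule out a version problem, it may help to print the installed versions and run the same selector with the built-in parser. This is only a sketch: it assumes Beautiful Soup 4.7.0 or later, where select() is backed by the soupsieve package, and it uses html.parser so that lxml does not need to be installed.

```python
import bs4
import soupsieve
from bs4 import BeautifulSoup

# Report the installed versions (assumes bs4 >= 4.7.0, which depends on soupsieve).
print("beautifulsoup4", bs4.__version__)
print("soupsieve", soupsieve.__version__)

# Same markup and selector as above, parsed with the standard-library parser.
html = '<ul class="items"><li>1</li><li>2</li></ul>'
soup = BeautifulSoup(html, "html.parser")
items = soup.select("ul.items > li")
print(len(items))
```

If this prints 2 but the real page still yields a single item, the HTML being parsed probably differs from what the browser shows, so it is worth inspecting the downloaded markup itself.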

Here is another solution, using simplified_scrapy:

from simplified_scrapy.simplified_doc import SimplifiedDoc
html = '''
<ul class="items">
  <li>1</li>
  <li>2</li>
</ul>
'''
doc = SimplifiedDoc(html)
item = doc.selects('ul.items>li')
print(len(item))

More examples are available here.

Answer 1
0 2020-02-07 21:06:49

Answer 2
0 2020-02-08 01:25:51
