Python: extracting text with BeautifulSoup from a tag that has no class or other attributes?

Question

Quick question (I am not very familiar with Python's BeautifulSoup()): if I have the element below, how can I extract "1 comment" (or "2 comments", etc.)? There is no class (or id, or other attribute) on that a tag.

<td class="subtext">
  <a href="item?id=22823679">1&nbsp;comment</a>
</td>

Answer 1

You can use the select method to run a CSS selector (the same syntax as querySelectorAll) against your HTML, and then take the contents of the elements it finds:

elements = soup.select(".subtext a")
[x.contents for x in elements]
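For a version that runs end to end, here is a minimal sketch, assuming the snippet from the question is held in a string (the variable name html is made up for illustration). Note that contents returns a list of child nodes for each element, so get_text() is used here to obtain a plain string instead, and the non-breaking space that &nbsp; decodes to is replaced with an ordinary space.

```python
from bs4 import BeautifulSoup

# The element from the question, inlined as a string (hypothetical variable name).
html = """
<td class="subtext">
  <a href="item?id=22823679">1&nbsp;comment</a>
</td>
"""

soup = BeautifulSoup(html, "html.parser")

# Every <a> nested inside an element with class "subtext".
links = soup.select(".subtext a")

# get_text() returns the link text as a single string. &nbsp; is decoded
# to "\xa0", so swap it for a regular space.
texts = [a.get_text().replace("\xa0", " ") for a in links]
print(texts)  # ['1 comment']
```

If the page has several such rows, texts holds one entry per link, in document order.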

Answer 2

How about the following? I tested it with a local HTML file.

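from bs4 import BeautifulSoup

# Path to a local copy of the page (a file path, not a URL)
path = "D:\\Temp\\example.html"

with open(path, "r", encoding="utf-8") as page:
    contents = page.read()

soup = BeautifulSoup(contents, "html.parser")
# select() returns a list of every <td class="subtext"> in the document
elements = soup.select("td.subtext")
# strip=True drops the newlines and indentation around the link text
value = elements[0].get_text(strip=True)
print(value)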

example.html

<html>
    <head></head>
        <body>
            <td class="subtext">
                <a href="item?id=22823679">1&nbsp;comment</a>
            </td>
        </body>
</html>
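Even without a class or id, the a tag still has an href that a selector can match. The following is a rough sketch, assuming the comment links always start with item?id= (an assumption drawn from the single example in the question, not something the page guarantees). The helper name comment_label is invented for illustration.

```python
from bs4 import BeautifulSoup

# Same markup as example.html, inlined so the sketch needs no file.
html = """
<html>
    <head></head>
    <body>
        <td class="subtext">
            <a href="item?id=22823679">1&nbsp;comment</a>
        </td>
    </body>
</html>
"""


def comment_label(soup):
    """Return the comment link text, or None if no matching link exists.

    Hypothetical helper: assumes comment links have an href beginning
    with "item?id=" and sit inside <td class="subtext">.
    """
    link = soup.select_one('td.subtext a[href^="item?id="]')
    if link is None:
        return None
    # strip=True trims surrounding whitespace; the "\xa0" produced by
    # &nbsp; sits inside the text, so it is replaced separately.
    return link.get_text(strip=True).replace("\xa0", " ")


soup = BeautifulSoup(html, "html.parser")
print(comment_label(soup))  # 1 comment
```

Returning None when nothing matches avoids the IndexError that indexing an empty select() result would raise.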

2 answers

Answer 1: 1, 2020-04-10 01:35:49
Answer 2: 1, accepted, 2020-04-10 02:31:06