Parsing a span with Beautiful Soup: 'NoneType' object has no attribute 'text'

Question

I'm trying to get all relationship names on a LinkedIn web page (example: https://www.linkedin.com/in/diversiti/detail/skills/(ACoAACfEjjEBNLPrc1Y8OKosqroRRScfwaCdrxI,5)/ ).

(Please note the ')' char before the '5').

Here is part of the HTML code:

<div class="pv-endorsement-entity__detail  pl3">
    <div class="pv-endorsement-entity__name t-16 t-black t-bold truncated-text">
        <span class="pv-endorsement-entity__name--has-hover">Vignesh G</span>
        <span data-test-distance-badge="" id="ember122"
            class="distance-badge t-black--light t-14 separator t-black--light ember-view"><span
                class="visually-hidden">
                out of network
            </span>
            <span class="dist-value" aria-hidden="true">3rd+</span>
        </span>
    </div>
    <div class="pv-endorsement-entity__headline t-14 t-black--light t-normal">
        Inventor | Engineer | MBA
    </div>
</div>

I want to get the name, so in this case "Vignesh G".

Here is my Python code:

from bs4 import BeautifulSoup
from requests_html import HTMLSession
session = HTMLSession()
response = session.get('https://www.linkedin.com/in/diversiti/detail/skills/(ACoAACfEjjEBNLPrc1Y8OKosqroRRScfwaCdrxI,5)/')
soup = BeautifulSoup(response.content, 'html.parser')

content = soup.find('span', {'class': 'pv-endorsement-entity__name--has-hover'}).text

print(content)

Unfortunately I got this error:

'NoneType' object has no attribute 'text'

I suppose that the span object is empty for BeautifulSoup, but how do I get the text in this object?

Answer 1

LinkedIn loads the content later, after the initial response. The initial content does not contain a body tag, so the span is not there for BeautifulSoup to find and soup.find returns None. You should use Selenium to simulate a browser:

https://pypi.org/project/selenium/
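
The error itself can be reproduced without touching the network. The sketch below is not from the original post: the two HTML strings and the helper name `first_endorser_name` are assumptions made for illustration. It shows that `soup.find` returns `None` when the span is missing from the markup, which is what the initial response looks like, and that the same call works once the span is present.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-ins for the markup before and after the page has filled in its content.
INITIAL_HTML = "<html><head><title>LinkedIn</title></head></html>"
RENDERED_HTML = """
<div class="pv-endorsement-entity__name t-16 t-black t-bold truncated-text">
    <span class="pv-endorsement-entity__name--has-hover">Vignesh G</span>
</div>
"""


def first_endorser_name(html):
    """Return the first endorser name in html, or None if the span is absent."""
    soup = BeautifulSoup(html, "html.parser")
    span = soup.find("span", class_="pv-endorsement-entity__name--has-hover")
    # find() returns None when nothing matches, so check before reading the text.
    return span.get_text(strip=True) if span is not None else None


print(first_endorser_name(INITIAL_HTML))   # None
print(first_endorser_name(RENDERED_HTML))  # Vignesh G
```

Checking for `None` only avoids the exception. It does not make the missing content appear, which is why a real browser is still needed.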

With Selenium, you can load the URL and wait for the page to load its content completely. It comes with utility functions such as find_element_by_tag_name (written find_element(By.TAG_NAME, ...) in Selenium 4), which will work fine as a replacement for the BeautifulSoup approach that you are currently taking.
