
How to scrape one of the spans inside another span with a class?

<span class="sim-posted">
        
            <span class="jobs-status covid-icon clearfix">
                <i class="covid-home-icon"></i>Work from Home 
            </span>
            <span>Posted few days ago</span>
            
    </span>

I want to scrape the last span tag, the one with the text "Posted few days ago". I have the code below, but it only scrapes the first span, the one with a class:

date_published=job.find('span',class_='sim-posted').span.text

Try this. It finds the span without a class inside the span you already reached:

date_published=job.find('span',class_='sim-posted').find("span", {"class": False}).text
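
To see why the original attempt returns the wrong element, here is a minimal sketch that compares the two lookups on the markup from the question. It assumes `job` is a Beautiful Soup object wrapping one listing; the variable names and the `find_all(..., recursive=False)` alternative are illustrative additions, not part of the original answer.

```python
from bs4 import BeautifulSoup

html = """
<span class="sim-posted">
    <span class="jobs-status covid-icon clearfix">
        <i class="covid-home-icon"></i>Work from Home
    </span>
    <span>Posted few days ago</span>
</span>
"""

# `job` stands in for the asker's per-listing element (an assumption).
job = BeautifulSoup(html, "html.parser")
outer = job.find("span", class_="sim-posted")

# .span is shorthand for .find("span"): it returns the first nested span.
first = outer.span.get_text(strip=True)

# {"class": False} matches only tags that have no class attribute at all.
no_class = outer.find("span", {"class": False}).get_text(strip=True)

# Alternative: take the last direct child span, whatever its attributes.
last_child = outer.find_all("span", recursive=False)[-1].get_text(strip=True)

print(first)
print(no_class)
print(last_child)
```

The attribute filter depends on the target span having no class, while the last-direct-child lookup depends on its position, so which one is more robust depends on how the real page varies.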

If it is always the last <span>, you can use the CSS selector last-of-type:

soup.select_one('span.sim-posted span:last-of-type').text

Example

from bs4 import BeautifulSoup

html='''
<span class="sim-posted">
        
            <span class="jobs-status covid-icon clearfix">
                <i class="covid-home-icon"></i>Work from Home 
            </span>
            <span>Posted few days ago</span>
            
    </span>
'''
soup = BeautifulSoup(html, "html.parser")

print(soup.select_one('span.sim-posted span:last-of-type').text)

Output

Posted few days ago

Alternative

You can also use :-soup-contains, a CSS pseudo-class selector that targets a node's text. It needs the SoupSieve integration that was added in Beautiful Soup 4.7.0.

soup.select_one('span.sim-posted span:-soup-contains("Posted")').text
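
Because select_one returns None when nothing matches, calling .text directly can raise an AttributeError on listings that lack the "Posted" span. The sketch below wraps the text-based selector in a small helper; `posted_text` is a hypothetical name, and it assumes the installed SoupSieve version recognises `:-soup-contains`.

```python
from bs4 import BeautifulSoup

html = """
<span class="sim-posted">
    <span class="jobs-status covid-icon clearfix">
        <i class="covid-home-icon"></i>Work from Home
    </span>
    <span>Posted few days ago</span>
</span>
"""


def posted_text(node):
    """Return the text of the nested span containing "Posted", or None."""
    # Hypothetical helper: guards against listings with no matching span.
    match = node.select_one('span.sim-posted span:-soup-contains("Posted")')
    return match.get_text(strip=True) if match else None


soup = BeautifulSoup(html, "html.parser")
empty = BeautifulSoup('<span class="sim-posted"></span>', "html.parser")

print(posted_text(soup))
print(posted_text(empty))
```

Matching on text is tied to the page's wording, so it would miss the span if the site phrased the date differently.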

To scrape the last SPAN tag with the text "Posted few days ago" using Selenium, you can use any of the following locator strategies:

  • Using CSS with last-child:

     span.sim-posted span:last-child
  • Using CSS with last-of-type:

     span.sim-posted span:last-of-type
  • Using CSS with nth-child():

     span.sim-posted span:nth-child(2)
  • Using CSS with nth-of-type():

     span.sim-posted span:nth-of-type(2)
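
The four selectors above can be checked offline before wiring them into a browser session. This is a rough sketch that runs them through Beautiful Soup rather than Selenium (a substitution made here so the check needs no browser), against the markup from the question; the `selectors` and `results` names are illustrative.

```python
from bs4 import BeautifulSoup

html = """
<span class="sim-posted">
    <span class="jobs-status covid-icon clearfix">
        <i class="covid-home-icon"></i>Work from Home
    </span>
    <span>Posted few days ago</span>
</span>
"""

soup = BeautifulSoup(html, "html.parser")

# The four CSS selectors listed above, unchanged.
selectors = [
    "span.sim-posted span:last-child",
    "span.sim-posted span:last-of-type",
    "span.sim-posted span:nth-child(2)",
    "span.sim-posted span:nth-of-type(2)",
]

# Map each selector to the stripped text of its first match.
results = {sel: soup.select_one(sel).get_text(strip=True) for sel in selectors}

for sel, text in results.items():
    print(f"{sel!r} -> {text!r}")
```

In Selenium the same selector string would be passed as `driver.find_element(By.CSS_SELECTOR, selector).text`, where `driver` is an already-started WebDriver (assumed here) and `By` comes from `selenium.webdriver.common.by`. The positional variants (nth-child(2), nth-of-type(2)) only hold while the outer span has exactly this layout.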
