
Python3 HTML Find

I'm attempting an HTML scrape and need to dig deeper into the result of a findAll.

The HTML is below:

<div class="jdgm-rev-widg__reviews">
    <div class="jdgm-rev jdgm-divider-top" data-product-title="TITLE" data-product-url="PRODUCT-URL" data-thumb-down-count="0" data-thumb-up-count="0" data-verified-buyer="true">
        <div class="jdgm-rev__header">
            <div class="jdgm-rev__icon"> TEXT </div> <span aria-label="5 star review" class="jdgm-rev__rating" data-score="5" tabindex="0"> <a class="jdgm-star jdgm--on"></a><a class="jdgm-star jdgm--on"></a><a class="jdgm-star jdgm--on"></a><a class="jdgm-star jdgm--on"></a><a class="jdgm-star jdgm--on"></a> </span> <span class="jdgm-rev__timestamp jdgm-spinner" data-content="2019-12-24"> </span>
            <div class="jdgm-rev__br"></div> <span class="jdgm-rev__buyer-badge-wrapper"> <span class="jdgm-rev__buyer-badge"></span> </span> <span class="jdgm-rev__author-wrapper" data-all-initials="TEXT" data-fullname="TEXT" data-last-initial="TEXT" data-location-city="CITY" data-location-country="United States" data-location-country-code="US" data-location-state="STATE" data-location-state-code="WW"> <span class="jdgm-rev__author">TEXT</span> <span class="jdgm-rev__location"></span> </span>
        </div>

I'm trying to extract the value of data-score (here "5") from the span with class="jdgm-rev__rating".

This is what I have tried, but it does not return the right data:

score = container.findAll("span",{"data-score":" "})

I'm not sure how to dig further into the span to get the number. The number is different for every review, because it is an attribute value pulled from the data rather than a fixed class or id.

Thanks for any help.

Use {"data-score":True}, which matches any span that has a data-score attribute, whatever its value:

score = container.findAll("span",{"data-score":True})
print(score[0]['data-score'])

Prints:

5
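
To see why the original attempt came back empty, here is a minimal sketch. It assumes bs4 is installed and uses a trimmed-down copy of the markup from the question; the variable names no_match, any_value and scores are only illustrative. A string filter such as " " has to equal the attribute value exactly, while True accepts any value.

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of the review header from the question.
html = """
<div class="jdgm-rev__header">
  <span aria-label="5 star review" class="jdgm-rev__rating" data-score="5" tabindex="0"></span>
  <span class="jdgm-rev__timestamp jdgm-spinner" data-content="2019-12-24"> </span>
</div>
"""

container = BeautifulSoup(html, "html.parser")

# A string filter must equal the attribute value exactly, so " " never matches "5".
no_match = container.find_all("span", {"data-score": " "})

# True matches every span that carries the attribute, whatever its value.
any_value = container.find_all("span", {"data-score": True})

# Attribute values are strings; convert them if a number is needed.
scores = [int(tag["data-score"]) for tag in any_value]

print(len(no_match), scores)
```

find_all is the current spelling of findAll; both names work in bs4.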

Or use a CSS selector:

score = container.select("span[data-score]")
print(score[0]['data-score'])
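
If the page holds several reviews, the selector can be narrowed by class and applied per review. This is only a sketch under assumptions: the three-review markup is made up for illustration (modelled on the class names in the question), and review_scores is a hypothetical helper, not part of bs4.

```python
from bs4 import BeautifulSoup

# Made-up markup with three reviews; the last one has no data-score.
html = """
<div class="jdgm-rev-widg__reviews">
  <div class="jdgm-rev"><span class="jdgm-rev__rating" data-score="5"></span></div>
  <div class="jdgm-rev"><span class="jdgm-rev__rating" data-score="3"></span></div>
  <div class="jdgm-rev"><span class="jdgm-rev__rating"></span></div>
</div>
"""


def review_scores(soup):
    """Return one score per review, or None where data-score is missing."""
    scores = []
    for review in soup.select("div.jdgm-rev"):
        # select_one returns the first match or None.
        rating = review.select_one("span.jdgm-rev__rating[data-score]")
        scores.append(int(rating["data-score"]) if rating else None)
    return scores


soup = BeautifulSoup(html, "html.parser")
print(review_scores(soup))
```

Looping per review keeps each score tied to its own review, so a missing rating shows up as None instead of shifting the later scores by one position.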


 