
Finding an element using Selenium and Python

I am trying to read the review count from the HTML below, but I cannot locate the element. I have tried the class name, a CSS selector, and so on, and none of them find it. There are also several such elements on the page that I need to loop through to collect each review count. How do I do that? Any help would be appreciated. The HTML is below.

<a class="reviewsCount ml-5 fleft blue-text " href="https://www.ambitionbox.com/reviews/larsen-and-toubro-infotech-reviews?utm_campaign=srp_ratings&amp;utm_medium=desktop&amp;utm_source=naukri" target="_blank" title="Powered by Ambition Box">(2148 Reviews)</a>

<a class="reviewsCount ml-5 fleft blue-text " href="https://www.ambitionbox.com/reviews/dxc-technology-reviews?utm_campaign=srp_ratings&amp;utm_medium=desktop&amp;utm_source=naukri" target="_blank" title="Powered by Ambition Box">(3919 Reviews)</a>

You can get the element text with this:

from selenium.webdriver.common.by import By

all_text = driver.find_element(By.XPATH, "//a[contains(@href,'https://www.ambitionbox.com/reviews/larsen-and-toubro-infotech-reviews')]").text

Now you can extract the count of reviews with this:

reviews = int(''.join(filter(str.isdigit, all_text)))

Or with this:

import re

reviews = int(re.findall(r'\d+', all_text)[0])
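If it helps, the extraction step can be wrapped in a small helper. This is only a sketch: the name parse_review_count is made up for illustration, and the handling of thousands separators (for example "12,345") is an assumption, since the sample HTML only shows plain digits.

```python
import re


def parse_review_count(text):
    """Extract the first integer from text such as '(2148 Reviews)'.

    Returns None when the text contains no digits.
    """
    # Assumption: counts may contain commas as thousands separators.
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


print(parse_review_count("(2148 Reviews)"))
```

Returning None instead of raising makes it easier to skip links that have no count when looping over many elements.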

Don't forget to wait before accessing the element, so that it is fully loaded. An explicit wait such as WebDriverWait is more reliable than a fixed delay.

You can combine BeautifulSoup with Selenium, for example by passing driver.page_source to it instead of the literal string used here:

from bs4 import BeautifulSoup

data = """<a class="reviewsCount ml-5 fleft blue-text " href="https://www.ambitionbox.com/reviews/larsen-and-toubro-infotech-reviews?utm_campaign=srp_ratings&amp;utm_medium=desktop&amp;utm_source=naukri" target="_blank" title="Powered by Ambition Box">(2148 Reviews)</a>"""
soup = BeautifulSoup(data, 'html.parser')
finds = soup.find('a', {'class': 'reviewsCount'})
print(finds.text)
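Since the question mentions several such links, here is a rough sketch of the same idea with find_all, which returns every match rather than only the first. The html string is just the two snippets from the question with the query strings trimmed; in a real run it would presumably come from driver.page_source.

```python
from bs4 import BeautifulSoup

# Stand-in for driver.page_source: the two links from the question,
# with the utm_* query parameters trimmed for readability.
html = """
<a class="reviewsCount ml-5 fleft blue-text " href="https://www.ambitionbox.com/reviews/larsen-and-toubro-infotech-reviews" target="_blank">(2148 Reviews)</a>
<a class="reviewsCount ml-5 fleft blue-text " href="https://www.ambitionbox.com/reviews/dxc-technology-reviews" target="_blank">(3919 Reviews)</a>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ matches any element whose class list contains "reviewsCount".
texts = [a.get_text(strip=True) for a in soup.find_all("a", class_="reviewsCount")]
print(texts)
```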

Try a CSS selector:

a[href*='https://www.ambitionbox.com/reviews/larsen-and-toubro-infotech-reviews'][class^='reviewsCount']

Code:

wait = WebDriverWait(driver, 10)
total_review_count = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "a[href*='https://www.ambitionbox.com/reviews/larsen-and-toubro-infotech-reviews'][class^='reviewsCount']")))
print(total_review_count.text)

Imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
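To loop over all of the review links rather than one, something along these lines might work. It is a sketch under a few assumptions: the selector a.reviewsCount is inferred from the class attribute in the question, and the function names review_counts and scrape_review_counts are invented for this example. The parsing part only needs objects with a .text attribute, so it can be tried without a browser.

```python
import re


def review_counts(elements):
    """Return the integer review count for each element exposing .text."""
    counts = []
    for element in elements:
        match = re.search(r"\d+", element.text)
        if match:  # skip links whose text has no digits
            counts.append(int(match.group()))
    return counts


def scrape_review_counts(driver, timeout=10):
    """Wait for the review links on the current page, then parse them all."""
    # Imported here so review_counts() stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumed selector: every <a> whose class list contains "reviewsCount".
    links = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, "a.reviewsCount"))
    )
    return review_counts(links)
```

With a driver that has already loaded the results page, scrape_review_counts(driver) should return one integer per review link, in page order.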
