
How to extract the text elements using Selenium in Python?


I am using Selenium to scrape content from the App Store: https://apps.apple.com/us/app/bank-of-america-private-bank/id1096813830

I tried to extract the review text "As subject matter experts, our team is very engaging...".

I tried to find the elements by class name:

review_ratings = driver.find_elements_by_class_name('we-truncate we-truncate--multi-line we-truncate--interactive ember-view we-customer-review__body')
review_ratingsList = []
for e in review_ratings:
    review_ratingsList.append(e.get_attribute('innerHTML'))
review_ratings

But it returns an empty list [].

Is anything wrong with the code, or is there a better solution? Thanks for your help.
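
One likely cause of the empty list, sketched under the assumption that a class-name locator accepts only a single class: the string passed above holds five space-separated classes, so it cannot match as one class name. The helper `classes_to_css` below is hypothetical (it is not part of Selenium) and only shows how such a string could be turned into a compound CSS selector.

```python
def classes_to_css(class_attr: str) -> str:
    """Turn a space-separated class attribute into a compound CSS selector."""
    # str.split() with no argument collapses any run of whitespace.
    return "".join("." + name for name in class_attr.split())


selector = classes_to_css("we-truncate we-customer-review__body")
print(selector)  # .we-truncate.we-customer-review__body
```

With Selenium, the result could then be passed to a CSS-selector lookup such as `driver.find_elements(By.CSS_SELECTOR, selector)`. Matching on one stable class, for example `we-customer-review__body`, is probably less brittle than requiring all five.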

Using requests and BeautifulSoup:

import requests
from bs4 import BeautifulSoup

url = 'https://apps.apple.com/us/app/bank-of-america-private-bank/id1096813830'

res = requests.get(url)
soup = BeautifulSoup(res.text,'lxml')
item = soup.select_one("blockquote > p").text
print(item)

Output:

As subject matter experts, our team is very engaging and focused on our near and long term financial health!

You can use WebDriverWait to wait until the elements are visible and then read their text. Also check how to choose a good Selenium locator.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

#...

wait = WebDriverWait(driver, 5)
review_ratings = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".we-customer-review")))
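for review_rating in review_ratings:
    # find_element(By...) replaces the find_element_by_* helpers,
    # which were removed in newer Selenium 4 releases
    stars = review_rating.find_element(By.CSS_SELECTOR, ".we-star-rating").get_attribute("aria-label")
    title = review_rating.find_element(By.CSS_SELECTOR, "h3").text
    review = review_rating.find_element(By.CSS_SELECTOR, "p").text
    print(stars, title, review)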
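
The `aria-label` read above comes back as a plain string (or `None` if the attribute is missing). If a numeric rating is needed, it could be parsed roughly as follows. This is only a sketch: the label format `"5 out of 5"` and the helper name `parse_star_label` are assumptions for illustration, not something confirmed by the page or provided by Selenium.

```python
import re
from typing import Optional


def parse_star_label(label: Optional[str]) -> Optional[float]:
    """Return the first number in a label such as '5 out of 5', else None."""
    if not label:
        return None
    # Take the first integer or decimal number that appears in the label.
    match = re.search(r"\d+(?:\.\d+)?", label)
    return float(match.group()) if match else None


print(parse_star_label("5 out of 5"))  # 5.0
print(parse_star_label("no rating"))   # None
```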

May I suggest combining Selenium with BeautifulSoup? Using the webdriver:

from bs4 import BeautifulSoup
from selenium import webdriver
browser = webdriver.Chrome()
url = "https://apps.apple.com/us/app/bank-of-america-private-bank/id1096813830"
browser.get(url)
innerHTML = browser.execute_script("return document.body.innerHTML")

bs = BeautifulSoup(innerHTML, 'html.parser')

bs.blockquote.p.text

Output:

Out[22]: 'As subject matter experts, our team is very engaging and focused on our near and long term financial health!'

If there's something to be explained, just tell me!

Use WebDriverWait with presence_of_all_elements_located and the following CSS selector.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://apps.apple.com/us/app/bank-of-america-private-bank/id1096813830")
review_ratings = WebDriverWait(driver, 20).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '.we-customer-review__body p[dir="ltr"]')))
review_ratingsList = []
for e in review_ratings:
    review_ratingsList.append(e.get_attribute('innerHTML'))
print(review_ratingsList)

Output:

['As subject matter experts, our team is very engaging and focused on our near and long term financial health!', 'Very much seems to be an unfinished app. Can’t find secure message alert. Or any alerts for that matter. Most of my client team is missing from the “send to” list. I have other functions very useful, when away from my computer.']
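
Because `get_attribute('innerHTML')` returns markup rather than rendered text, entries may contain tags such as `<br>` or entities such as `&amp;`. A minimal standard-library sketch for flattening such strings is shown below; the helper name `inner_html_to_text` and the sample string are made up for illustration. Reading the element's `.text` in Selenium is an alternative when the element is visible.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect text nodes and treat <br> as a space."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; before handle_data.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def inner_html_to_text(fragment: str) -> str:
    """Strip tags from an innerHTML fragment and normalize whitespace."""
    collector = _TextCollector()
    collector.feed(fragment)
    collector.close()
    return " ".join("".join(collector.parts).split())


sample = "Very useful<br>when away &amp; busy, <b>thanks</b>!"
print(inner_html_to_text(sample))  # Very useful when away & busy, thanks!
```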


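Notice: technical posts on this site are published under the CC BY-SA 4.0 license. If you repost them, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.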
