
Why isn't my Selenium code showing any text?

I want to scrape a whole website with Selenium. I found the class name used for product names on the site, and I want to get all the product names under that one class name, without manually copying an id or XPath for every product. I have tried this:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

driver_exe = 'chromedriver'
options = Options()
options.add_argument("--headless")
driver = webdriver.Chrome(executable_path=r"C:\Users\intel\Downloads\Setups\chromedriver.exe", options=options)

driver.get("https://www.justdial.com/Bangalore/Bakeries")
x = driver.find_elements_by_class_name("store-name")

for i in x:
    print(i.text)

It's not displaying anything. Why? A parser like Beautiful Soup mixed with Selenium would also be acceptable, but I want Selenium anyway...

To bypass the "Access Denied" page in headless mode, use a different user agent. You can use your own user agent; google "my user agent" to get it. (Note that newer Selenium releases removed `find_elements_by_class_name`; there you would write `driver.find_elements(By.CLASS_NAME, "store-name")` with `from selenium.webdriver.common.by import By`.)

from selenium import webdriver

user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) ' \
             'Chrome/80.0.3987.132 Safari/537.36'
options = webdriver.ChromeOptions()
options.add_argument("--headless")
options.add_argument(f'user-agent={user_agent}')
driver = webdriver.Chrome(options=options)

driver.get("https://www.justdial.com/Bangalore/Bakeries")
x = driver.find_elements_by_class_name("store-name")

for i in x:
    print(i.text)

Using requests and BeautifulSoup:

from bs4 import BeautifulSoup
import requests

user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) ' \
             'Chrome/80.0.3987.132 Safari/537.36'

response = requests.get('https://www.justdial.com/Bangalore/Bakeries', headers={'user-agent': user_agent})
soup = BeautifulSoup(response.text, 'lxml')
stores = soup.select('.store-name')
for store in stores:
    print(store.text.strip())

Output:

Big Mishra Pedha
Just Bake
The Cake Factory
Queen Of Cakeland
SREENIVASA BRAHMINS BAKERY ..
Aubree Eat Play Love Chocol..
Jain Bakes
Ammas Pastries
Facebake
Holige Mane Brahmins Bakery
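If neither Selenium nor BeautifulSoup is at hand, the same idea, collecting the text of every element carrying a given class, can be sketched with only the standard library's `html.parser`. The HTML sample below is a made-up stand-in for the real page, just to show the extraction logic:

```python
from html.parser import HTMLParser

class StoreNameParser(HTMLParser):
    """Collects the text of every element whose class list contains 'store-name'.

    Simplified sketch: it tracks tag nesting with a counter, so unclosed void
    tags such as a bare <br> inside a matched element would desync it.
    """
    def __init__(self):
        super().__init__()
        self.depth = 0      # >0 while inside a matched element
        self.names = []
        self._buf = []

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        if "store-name" in classes:
            self.depth += 1
            self._buf = []
        elif self.depth:
            self.depth += 1  # nested tag inside a matched element

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.names.append("".join(self._buf).strip())

    def handle_data(self, data):
        if self.depth:
            self._buf.append(data)

# Hypothetical stand-in for the fetched page source:
sample = '<ul><li class="store-name">Just Bake</li><li class="store-name">Facebake</li></ul>'
parser = StoreNameParser()
parser.feed(sample)
print(parser.names)  # → ['Just Bake', 'Facebake']
```

In practice you would feed it `driver.page_source` (or `response.text`) instead of the inline sample; a real scraper is still better off with BeautifulSoup's `.select('.store-name')`, which handles malformed markup far more robustly.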
