
Web scraping a hidden DIV that is only shown after clicking a button on the webpage

I need to scrape data from a website that has a hidden div, which is not shown until you click a button on the page. When I fetch the HTML with code, the hidden div's content is missing, even though I can see its data in the browser's "Inspect" panel.

The URL, code, and hidden div are shown below:

import requests
import bs4

url = 'https://so.gushiwen.org/guwen/bookv_3694.aspx'
doc = requests.get(url)
# Prints only the HTML the server returned; the hidden div's content is not in it
print(bs4.BeautifulSoup(doc.text, "html.parser"))
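
"Inspect" shows the live DOM after the page's scripts have run, while requests only receives the server's initial response, so the two can differ. A minimal sketch of how to check which case applies is below. The helper name `div_text` and both HTML snippets are made up for illustration; only the id `right2321` comes from this thread.

```python
from bs4 import BeautifulSoup


def div_text(html, div_id):
    """Return the stripped text of the div with this id, or None if it is absent."""
    div = BeautifulSoup(html, 'html.parser').find('div', {'id': div_id})
    return div.get_text(strip=True) if div is not None else None


# Hypothetical markup: what the server might send vs. the DOM after the click.
static_html = '<div id="right2321" style="display:none"></div>'
rendered_html = '<div id="right2321"><p>full text</p></div>'

print(repr(div_text(static_html, 'right2321')))    # ''
print(repr(div_text(rendered_html, 'right2321')))  # 'full text'
print(div_text(static_html, 'missing'))            # None
```

Calling `div_text(doc.text, 'right2321')` on the real response would show whether the div is empty or missing in the static HTML. If it is, the content is presumably added by JavaScript and a browser-driven tool is needed.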

[Image: the hidden div as shown in "Inspect"]

You can use selenium to locate the desired div by id and call send_keys('\n') on that element:

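from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Selenium 4 style: the driver path goes through Service, and
# find_element_by_id(...) is written as find_element(By.ID, ...)
d = webdriver.Chrome(service=Service('/path/to/chromedriver'))
d.get('https://so.gushiwen.org/guwen/bookv_3694.aspx')
d.find_element(By.ID, 'right2321').send_keys('\n')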

Now you can use BeautifulSoup to scrape the desired content:

from bs4 import BeautifulSoup as soup
content = soup(d.page_source, 'html.parser').find('div', {'id':'right2321'}).text
