Cannot scrape a page with infinite scrolling using Selenium

I am using Selenium to scrape tweets from the last year, but the page cannot be scrolled beyond a certain point and a "Back to Top" link appears instead. How can I overcome this problem using Selenium?

Here is my code:

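import time

from selenium import webdriver

# Selenium 3 style; Selenium 4 passes the driver path through a Service object.
driver = webdriver.Firefox(executable_path="/home/piyush/geckodriver")
url = "https://twitter.com/narendramodi"
driver.get(url)
time.sleep(6)  # give the initial timeline time to load

# Keep scrolling to the bottom until the page height stops growing.
lastHeight = driver.execute_script("return document.body.scrollHeight")
while True:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(6)
    newHeight = driver.execute_script("return document.body.scrollHeight")
    if newHeight == lastHeight:
        break
    lastHeight = newHeight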

Here is the output as an image:

You can use something like the following: wait, with a timeout, until "Back to Top" disappears, and then continue scraping.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("http://somedomain/url_that_delays_loading")
try:
    # Wait up to 10 seconds for the element to become invisible or leave the DOM.
    # until() raises TimeoutException if it is still visible after that.
    disappeared = WebDriverWait(driver, 10).until(
        EC.invisibility_of_element_located((By.ID, "myDynamicElement"))
    )

    if disappeared:
        print('Continue')
finally:
    driver.quit()
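
The scroll loop from the question stops the first time the page height does not change, which can also happen when new tweets are just slow to load. A minimal sketch of a more tolerant loop is below. The function name `scroll_until_stable` and the parameters `pause`, `max_stalls`, and `max_rounds` are made up for illustration; the only thing it assumes about `driver` is a Selenium-style `execute_script()` method. It is not a confirmed fix: if the site itself stops serving older tweets, no amount of scrolling will load more.

```python
import time


def scroll_until_stable(driver, pause=2.0, max_stalls=3, max_rounds=200):
    """Scroll to the bottom until the page height stops growing.

    `driver` is any object with a Selenium-style execute_script() method.
    Returns the number of scrolls after which the page height changed.
    """
    get_height = "return document.body.scrollHeight"
    last_height = driver.execute_script(get_height)
    stalls = 0  # consecutive scrolls that did not change the height
    grew = 0    # scrolls after which the height changed

    for _ in range(max_rounds):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to fetch more items
        new_height = driver.execute_script(get_height)

        if new_height == last_height:
            stalls += 1
            # Give up only after several unchanged checks in a row.
            if stalls >= max_stalls:
                break
        else:
            stalls = 0
            grew += 1
            last_height = new_height

    return grew
```

With the Firefox driver from the question, this could be called as `scroll_until_stable(driver, pause=6)` right after `driver.get(url)`. Raising `max_stalls` makes the loop more patient with slow connections, at the cost of a longer wait once the real end of the page is reached.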
