Scrape data from a CSV downloaded after right-clicking on a webpage using Selenium in Python
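I am trying to scrape data from a webpage using Python and Selenium. The page has a CSV download option that only becomes visible after right-clicking inside the chart frame. I have not been able to right-click the page and click the Download CSV option with Selenium. This is the page I am trying to get data from: https://datastudio.google.com/reporting/d97f5736-2b85-4f39-beba-6dc386c24429/ . This is what I tried: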
from selenium import webdriver
from selenium.webdriver import ActionChains

options = webdriver.ChromeOptions()
options.binary_location = r"<Path where chrome application is installed>"
driver = webdriver.Chrome(r"<path to chrome driver>",chrome_options=options)
driver.get("https://datastudio.google.com/reporting/d97f5736-2b85-4f39-beba-6dc386c24429/page/Z3ToB")
timeout = 10
action = ActionChains(driver)
# Move to the report canvas container, then right-click at that position
action.move_to_element(driver.find_element_by_xpath("//lego-canvas-container[@class='lego-canvas-container']")).perform()
action.context_click().perform()
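With this code the given XPath cannot be found. I also tried class names (such as the report area). Can anyone explain how to right-click anywhere in the frame and then find the Download CSV option in the menu?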
The menu is rendered by JavaScript only after the right-click, so its XPath cannot be found before right-clicking. Try this code; it worked for me.
from selenium import webdriver
from selenium.webdriver import ActionChains

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("start-maximized")
chrome_options.add_argument("disable-infobars")
chrome_options.add_argument("--disable-extensions")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--disable-dev-shm-usage")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument('--disable-blink-features=AutomationControlled')
driver = webdriver.Chrome(executable_path = 'chromedriver.exe',options = chrome_options)
driver.implicitly_wait(10)
driver.get("https://datastudio.google.com/reporting/d97f5736-2b85-4f39-beba-6dc386c24429/page/Z3ToB")
action = ActionChains(driver)
action.pause(1)
# Move the pointer onto the report area, then right-click to open the menu
action.move_by_offset(150,150).perform()
action.context_click().perform()
# Move to the menu button at this position and click it
action.move_to_element(driver.find_element_by_xpath('//*[@id="mat-menu-panel-0"]/div/span[5]/button')).perform()
action.click().perform()
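The snippets in this thread click Download CSV but do not control where the file is saved. The following is a minimal sketch, assuming Chrome's `download.default_directory` and `download.prompt_for_download` preferences are used to send the export to a known folder. `build_download_prefs` and `make_driver` are hypothetical helper names that do not come from the answers, and the Selenium import is deferred so the preference helper can be used on its own.

```python
import os


def build_download_prefs(download_dir):
    """Return Chrome preferences that save downloads to download_dir without a prompt."""
    return {
        "download.default_directory": os.path.abspath(download_dir),
        "download.prompt_for_download": False,
    }


def make_driver(download_dir):
    """Start Chrome with the download preferences applied.

    Assumes selenium is installed and chromedriver can be found on PATH.
    """
    # Imported lazily so build_download_prefs works without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("start-maximized")
    options.add_experimental_option("prefs", build_download_prefs(download_dir))
    return webdriver.Chrome(options=options)
```

With this in place, `driver = make_driver("downloads")` could replace the driver setup above, and the CSV would be expected to land in the `downloads` folder.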
Use the XPath shown in the code below to identify the element, right-click it, then find the Download CSV button and click it.
driver.get("https://datastudio.google.com/reporting/d97f5736-2b85-4f39-beba-6dc386c24429/page/Z3ToB")
time.sleep(5) #delay to load page properly. you can use explicit wait as well
element=driver.find_element_by_xpath("//div[@class='drop-zone-text']")
action = ActionChains(driver)
action.move_to_element(element).perform()
action.context_click().perform()
#To click on download csv
WebDriverWait(driver,5).until(EC.element_to_be_clickable((By.XPATH,"//button[contains(.,'Download CSV')]"))).click()
You need to import the following libraries:
import time
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
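Clicking Download CSV only starts the download; to scrape the data the script still has to wait for the file and read it. Below is a minimal sketch using only the standard library. It assumes the browser saves into a known `download_dir` and that Chrome keeps a `.crdownload` file next to the target while the download is in progress. `wait_for_csv` and `read_csv_rows` are hypothetical names, and the `utf-8-sig` encoding is an assumption about the exported file.

```python
import csv
import glob
import os
import time


def wait_for_csv(download_dir, timeout=30.0, poll=0.5):
    """Poll download_dir until a finished .csv file appears and return its path."""
    deadline = time.monotonic() + timeout
    while True:
        # Chrome writes to *.crdownload while a download is still in progress.
        partial = glob.glob(os.path.join(download_dir, "*.crdownload"))
        finished = glob.glob(os.path.join(download_dir, "*.csv"))
        if finished and not partial:
            # Return the most recently modified CSV.
            return max(finished, key=os.path.getmtime)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no CSV appeared in {download_dir!r} within {timeout}s")
        time.sleep(poll)


def read_csv_rows(path):
    """Load the CSV into a list of dicts keyed by the header row."""
    # utf-8-sig drops a leading byte-order mark if the export has one (assumption).
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return list(csv.DictReader(handle))
```

A call such as `rows = read_csv_rows(wait_for_csv("downloads"))` would follow the click on the Download CSV button.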
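I found it a bit easier to right-click and then use the arrow keys to select the appropriate option.
So you can perform a right click (context_click) anywhere on the canvas to open the menu popup. Then you can use the arrow keys to move up and down and select the "Download CSV" option.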
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys

actions = ActionChains(driver)
# Find the canvas element
element = driver.find_element_by_xpath('//*[@id="body"]/div/div/div[1]/div[2]/div/div[1]/div[1]/div[1]/div/lego-report/lego-canvas-container/div/file-drop-zone/span/content-section/div[3]/canvas-component')
# Right click the element, then press the Down key twice followed by the Enter to move to the Download CSV option and select it.
actions.move_to_element(element).context_click().send_keys(Keys.DOWN, Keys.DOWN, Keys.ENTER).perform()
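The keyboard approach hard-codes two Down presses. As a sketch of how that could be parameterised, the hypothetical helpers below build the key sequence from a 1-based menu position. `menu_key_names` and `choose_menu_item` are illustrative names, not part of Selenium. `position=2` reproduces the Down, Down, Enter sequence above, but which position Download CSV actually occupies in the menu is an assumption that should be checked in the browser.

```python
def menu_key_names(position):
    """Names of the keys that move to the menu item at 1-based position and activate it."""
    if position < 1:
        raise ValueError("position must be 1 or greater")
    return ["DOWN"] * position + ["ENTER"]


def choose_menu_item(driver, element, position):
    """Right-click element, then pick the menu item at position with the keyboard."""
    # Imported here so menu_key_names can be used without selenium installed.
    from selenium.webdriver.common.action_chains import ActionChains
    from selenium.webdriver.common.keys import Keys

    # Translate key names such as "DOWN" into Selenium's Keys constants.
    keys = [getattr(Keys, name) for name in menu_key_names(position)]
    ActionChains(driver).move_to_element(element).context_click().send_keys(*keys).perform()
```

For example, `choose_menu_item(driver, element, 2)` would send the same keys as the line above.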
driver.get("https://datastudio.google.com/reporting/d97f5736-2b85-4f39-beba-6dc386c24429/page/Z3ToB")
sleep(5)
wait = WebDriverWait(driver, 10)  # assumed timeout; the original snippet did not define wait
source= wait.until(EC.presence_of_element_located((By.XPATH,"/html/body")))
action = ActionChains(driver)
action.context_click(source).perform()
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR,"#mat-menu-panel-0 > div > span:nth-child(5) > button"))).click()
Strangely, I got it working with this. It seems you need to wait, context-click the body, and then click the menu element.
<button _ngcontent-fys-c1="" class="mat-focus-indicator mat-tooltip-trigger mat-menu-item ng-star-inserted" mat-menu-item="" role="menuitem" tabindex="0" aria-disabled="false"> Download CSV <!----><!----><!----><div class="mat-menu-ripple mat-ripple" matripple=""></div></button>
Imports
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from time import sleep
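Since the menu only exists after the right-click and the page can be slow to render, one attempt may fail where a second succeeds. The following is a minimal sketch of retrying the whole wait, right-click, click sequence. `retry` and `download_csv_once` are hypothetical helpers, the timeout and attempt counts are arbitrary, and the menu item is located by its visible text (the XPath from the earlier answer) rather than by the generated `mat-menu-panel-0` id.

```python
import time


def retry(action, attempts=3, delay=1.0, exceptions=(Exception,)):
    """Call action() until it succeeds or attempts run out, re-raising the last error."""
    if attempts < 1:
        raise ValueError("attempts must be 1 or greater")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise
            time.sleep(delay)


def download_csv_once(driver, timeout=10):
    """One attempt: context-click the body, then click the Download CSV menu item."""
    # Imported lazily so retry can be used without selenium installed.
    from selenium.webdriver import ActionChains
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    body = wait.until(EC.presence_of_element_located((By.TAG_NAME, "body")))
    ActionChains(driver).context_click(body).perform()
    wait.until(EC.element_to_be_clickable(
        (By.XPATH, "//button[contains(., 'Download CSV')]"))).click()
```

A call might look like `retry(lambda: download_csv_once(driver), attempts=3, delay=2)`. Restricting `exceptions` to Selenium's `TimeoutException` would keep unrelated errors from being retried.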