How to download multiple Excel files with the same class name from a website using Playwright
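This website ( https://www.mca.gov.in/content/mca/global/en/data-and-reports/company-llp-info/incorporated-closed-month.html# ) has multiple Excel files, all with the same class name but different values. I used Playwright's click() function to download one Excel file from the site. I don't know how to automatically download the second file after the first one has been downloaded.
Here is the code: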
import re
import asyncio
import requests
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False, slow_mo=50)
        page = await browser.new_page()
        web = "https://www.mca.gov.in/content/mca/global/en/data-and-reports/company-llp-info/incorporated-closed-month.html"
        await page.goto(web)
        await page.click('[class="expand-desk"]')
        # Only the first element matching the selector is clicked here
        async with page.expect_download() as download_info:
            await page.click('[class="doc-link download-file"]')
        download = await download_info.value
        print("download_url = ", download)
        # Pull the URL out of the printed Download object
        new = re.search(r"(?P<url>https?://[^\s ' ]+)", str(download)).group("url")
        print("New url = ", new)
        Filename = new.rsplit('=')[1] + ".xlsx"
        r = requests.get(new, allow_redirects=True)
        open(Filename, 'wb').write(r.content)
        await page.screenshot(path="report.png")
        await page.pause()
        await browser.close()

asyncio.run(main())
Can you suggest any ideas for this?
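First get a locator for the elements, then get the count of all matching elements. Loop up to that count with a for loop and use nth to click the download icons one by one.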
downloadButtons = page.locator('img[alt="download-icon"]')
# count() and click() are coroutines in the async API, so they must be awaited
count = await downloadButtons.count()
for i in range(count):
    await downloadButtons.nth(i).click()
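The loop above clicks each icon but does not capture the files. A minimal sketch that combines it with expect_download() might look like the following. The helper name download_all and the downloads directory are made up for illustration, the selectors are the ones from the question, and it is an assumption (not checked against the site) that a single click on expand-desk makes every link clickable.

```python
import asyncio
from pathlib import Path


async def download_all(page, selector, out_dir="downloads"):
    """Click every element matching `selector` and save each resulting download.

    `download_all` and `out_dir` are hypothetical names used for this sketch.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    buttons = page.locator(selector)
    count = await buttons.count()

    saved = []
    for i in range(count):
        # Start waiting for the download before the click that triggers it.
        async with page.expect_download() as download_info:
            await buttons.nth(i).click()
        download = await download_info.value
        target = out_path / download.suggested_filename
        await download.save_as(target)
        saved.append(target)
    return saved


async def main():
    # Imported here so download_all() can be reused or tested without a browser.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto(
            "https://www.mca.gov.in/content/mca/global/en/data-and-reports/"
            "company-llp-info/incorporated-closed-month.html"
        )
        # Selectors taken from the question; assumed to still match the page.
        await page.click('[class="expand-desk"]')
        files = await download_all(page, '[class="doc-link download-file"]')
        print(files)
        await browser.close()
```

To try it, call asyncio.run(main()). Saving through download.save_as() also avoids fetching the file a second time with requests.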
Maybe this simple example will help you:
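for CURRENT_XPATH in ['FIRST_XPATH', 'SECOND_XPATH']:
    # With the async API, both the context manager and the click must be awaited
    async with page.expect_download() as download_info:
        await page.click(CURRENT_XPATH)
    download = await download_info.value
    # Save each file under the name suggested by the server
    await download.save_as(download.suggested_filename)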
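One thing to watch when saving under suggested_filename in a loop: if two downloads report the same name, the second overwrites the first. Whether that happens on this site is not something I have checked, so treat this as a precaution. A small sketch of a helper (unique_path is a hypothetical name, not part of Playwright) that picks a free file name:

```python
from pathlib import Path


def unique_path(directory, filename):
    """Return a path inside `directory` that does not exist yet.

    If `filename` is already taken, _1, _2, ... is appended before the extension.
    """
    directory = Path(directory)
    candidate = directory / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```

In the loop it could be used as await download.save_as(unique_path(".", download.suggested_filename)).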