Failing to download a zipped CSV file using the Python requests library
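I am trying to download a zipped CSV file from https://www1.nseindia.com/products/content/derivatives/equities/archieve_fo.htm.
To see the file, select Report: Bhavcopy and Date: 18-06-2021. The page then offers the zip file fo18JUN2021bhav.csv.zip at the URL https://www1.nseindia.com/content/historical/DERIVATIVES/2021/JUN/fo18JUN2021bhav.csv.zip
Now, when I make a GET call to the zipped CSV URL using the requests library (with appropriate headers), I receive a [404] error.
Is there a way to download it programmatically? Thanks!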
To get the file, set the User-Agent and Referer HTTP headers:
import requests
from bs4 import BeautifulSoup

# change filetype and date you want to search for:
url = "https://www1.nseindia.com/ArchieveSearch?h_filetype=fobhav&date=18-06-2021&section=FO"

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0",
    "Referer": "https://www1.nseindia.com/products/content/derivatives/equities/archieve_fo.htm",
}

# the search page returns HTML; its first link points to the zip file
soup = BeautifulSoup(requests.get(url, headers=headers).content, "html.parser")
csv_url = "https://www1.nseindia.com" + soup.a["href"]

print("Downloading {}...".format(csv_url))
with open(csv_url.split("/")[-1], "wb") as f_out:
    f_out.write(requests.get(csv_url, headers=headers).content)
print("Done.")
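
The search URL above carries the file type, date, and section as query parameters. To search for other dates without editing the string by hand, the URL could be assembled with the standard library. This is only a sketch: `build_search_url` is a hypothetical helper name, and the only parameter values known to work are the ones from the example (`fobhav`, `18-06-2021`, `FO`); the DD-MM-YYYY date format is inferred from that example.

```python
from datetime import date
from urllib.parse import urlencode

BASE_URL = "https://www1.nseindia.com/ArchieveSearch"


def build_search_url(filetype, day, section):
    """Build the archive search URL (hypothetical helper, not part of any library)."""
    query = urlencode({
        "h_filetype": filetype,
        # DD-MM-YYYY, assumed from the example URL in the answer
        "date": day.strftime("%d-%m-%Y"),
        "section": section,
    })
    return "{}?{}".format(BASE_URL, query)


print(build_search_url("fobhav", date(2021, 6, 18), "FO"))
# https://www1.nseindia.com/ArchieveSearch?h_filetype=fobhav&date=18-06-2021&section=FO
```

The returned string can be passed to requests.get() together with the same headers dictionary.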
Running the download script saves the file fo18JUN2021bhav.csv.zip:
Downloading https://www1.nseindia.com/content/historical/DERIVATIVES/2021/JUN/fo18JUN2021bhav.csv.zip...
Done.
$ ls -alF fo18JUN2021bhav.csv.zip
-rw-r--r-- 1 root root 632963 june 20 23:11 fo18JUN2021bhav.csv.zip
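
Once the archive has been downloaded, its contents can be read without unpacking it to disk. The following is a minimal sketch using only the standard library; `read_csv_from_zip` is a hypothetical helper name, and the UTF-8 encoding is an assumption about the file, not something the site documents. If the server had returned an HTML error page instead of a zip, `zipfile.ZipFile` would raise `zipfile.BadZipFile`, which makes this a quick sanity check on the download as well.

```python
import csv
import io
import zipfile


def read_csv_from_zip(zip_bytes):
    """Return the rows of the first .csv member of a zip archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # pick the first member whose name ends in .csv
        csv_name = next(
            name for name in archive.namelist() if name.lower().endswith(".csv")
        )
        with archive.open(csv_name) as raw:
            # encoding is an assumption; adjust it if the file uses another one
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.reader(text))
```

For example, the bytes of fo18JUN2021bhav.csv.zip read with open(..., "rb") can be passed straight to this function, as can the .content of the download response.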