[英]Python - Scraping a PDF file from a URL
I want to scrape pdf files from this site https://www.sigmaths.net/Reader.php?var=manuels/ph/physique_pilote_7b.pdf I tried this code for that but it doesn't work. 誰能告訴我為什么?
res = requests.get('https://www.sigmaths.net/Reader.php?var=manuels/ph/physique_7b.pdf')
with open('C:\\Users\\sioud\\Desktop\\Manuels scolaires TN\\1\\test.pdf', 'wb') as f:
f.write(ress.content)
res = requests.get('https://www.sigmaths.net/manuels/ph/physique_7b.pdf',stream=True)
with open('test.pdf', 'wb') as f:
f.write(res.content)
your url is pointing to a reader https://www.sigmaths.net/Reader.php?var=manuels/ph/physique_7b.pdf
, remove the 'reader.php?var=
for the actual pdf
您也可以使用urlretrieve
。 查看我的解決方案代碼。
from urllib.request import urlretrieve
pdfurl = u"https://www.sigmaths.net/manuels/ph/physique_7b.pdf";
urlretrieve(pdfurl, "test.pdf")
並且你會發現需要的pdf下載名為test.pdf
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.