繁体   English   中英

Python - 从 URL 中抓取 PDF 文件

[英]Python - Scraping a PDF file from a URL

I want to scrape pdf files from this site https://www.sigmaths.net/Reader.php?var=manuels/ph/physique_pilote_7b.pdf I tried this code for that but it doesn't work. 谁能告诉我为什么?

res = requests.get('https://www.sigmaths.net/Reader.php?var=manuels/ph/physique_7b.pdf')
with open('C:\\Users\\sioud\\Desktop\\Manuels scolaires TN\\1\\test.pdf', 'wb') as f:
f.write(ress.content)
res = requests.get('https://www.sigmaths.net/manuels/ph/physique_7b.pdf',stream=True)
with open('test.pdf', 'wb') as f:
    f.write(res.content)

your url is pointing to a reader https://www.sigmaths.net/Reader.php?var=manuels/ph/physique_7b.pdf , remove the 'reader.php?var= for the actual pdf

您也可以使用urlretrieve 查看我的解决方案代码。

from urllib.request import urlretrieve
pdfurl = u"https://www.sigmaths.net/manuels/ph/physique_7b.pdf";
urlretrieve(pdfurl, "test.pdf")

并且你会发现需要的pdf下载名为test.pdf

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM