Web scraping to download a PDF file with Python, but getting a blank file
I am trying to download a PDF report from the web using Python, but the code ends up writing a blank PDF report. What is wrong with the code, and where did I go wrong?
==============================================
from BeautifulSoup import BeautifulSoup
import urllib2
import re
html_page = urllib2.urlopen("http://www.imd.gov.in/Welcome%20To%20IMD/Welcome.php")
soup = BeautifulSoup(html_page)
b = soup.findAll('a', attrs={'href': re.compile("^http://hydro.imd.gov.in/hydrometweb/")})
c = b[0]['href']
d = c[0:len(c)-12]
e = d + "PdfReportPage.aspx?ImgUrl=PRODUCTS/Rainfall_Statistics/Cumulative/District_RF_Distribution/DISTRICT_RAINFALL_DISTRIBUTION_COUNTRY_INDIA_cd.PDF"
def download_file(download_url):
    response = urllib2.urlopen(download_url)
    file = open("document.pdf", 'w')
    file.write(response.read())
    file.close()
    print("Completed")
download_file(e)
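Open the output file in binary mode by adding b to the mode ('wb' instead of 'w'). A PDF is binary data, and in text mode Python 2 on Windows translates newline bytes as it writes, which can corrupt the file.
Example: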
def download_file(download_url):
    response = urllib2.urlopen(download_url)
    with open("document.pdf", 'wb') as outfile:
        outfile.write(response.read())
    print("Completed")
download_file(e)
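For readers on Python 3, where urllib2 no longer exists, the same idea can be sketched roughly as follows. This is only a minimal sketch, not the original poster's code: the out_path and timeout parameters and the looks_like_pdf helper are assumptions added for illustration.

```python
# Python 3 sketch of the binary-mode download (illustrative only).
import shutil
import urllib.request


def download_file(download_url, out_path="document.pdf", timeout=30):
    """Stream the response body to out_path without altering any bytes.

    out_path and timeout are hypothetical parameters, not in the original post.
    """
    with urllib.request.urlopen(download_url, timeout=timeout) as response:
        # 'wb' writes the bytes exactly as received. In Python 3, writing
        # bytes to a file opened with 'w' raises TypeError instead.
        with open(out_path, "wb") as outfile:
            shutil.copyfileobj(response, outfile)
    return out_path


def looks_like_pdf(path):
    """Hypothetical sanity check: PDF files normally start with b'%PDF-'."""
    with open(path, "rb") as f:
        return f.read(5) == b"%PDF-"
```

Calling download_file(e) with the URL built in the question would write document.pdf to the current directory. If looks_like_pdf("document.pdf") then returns False, the server probably sent something other than the PDF itself (for example an HTML page), which is a separate problem from the file mode.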