
Unable to download url link with requests in Python

The objective is to download a tar.gz file from cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz

The file can be downloaded without any issue with wget:

!wget cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz --no-check-certificate

However, downloading the file using requests

import requests
url='cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz'
r = requests.get(url) 

returns an error:

MissingSchema                             Traceback (most recent call last)

<ipython-input-11-fa35f2c0ddc0> in <module>()
      1 url='cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz'
----> 2 r = requests.get(url)

5 frames

/usr/local/lib/python3.7/dist-packages/requests/models.py in prepare_url(self, url, params)
    386             error = error.format(to_native_string(url, 'utf8'))
    387 
--> 388             raise MissingSchema(error)
    389 
    390         if not host:

MissingSchema: Invalid URL 'cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz': No schema supplied. Perhaps you meant http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz?

May I know what is the issue?

Your url variable is missing http:// or https:// (the scheme, which the error message calls the "schema").

url = 'https://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz'
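To see why the bare string is rejected, it can help to look at how a URL parser splits it. The sketch below uses the standard library's urllib.parse rather than requests' own internal parsing, so it only illustrates the idea; the variable names are made up for this example.

```python
from urllib.parse import urlparse

# The URL exactly as written in the question, with no scheme in front
bare = "cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz"
# The same URL with an explicit scheme
full = "https://" + bare

bare_parts = urlparse(bare)
full_parts = urlparse(full)

# Without a scheme, the parser finds neither a scheme nor a host:
# the whole string is treated as a path.
print(repr(bare_parts.scheme), repr(bare_parts.netloc))

# With a scheme, the host is recognised as the network location.
print(repr(full_parts.scheme), repr(full_parts.netloc))
```

wget guesses http:// when no scheme is given, which is why the same string works on the command line but not in requests.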

You are missing the http:// prefix in the URL:

import requests
requests.get("http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz")

This should also work, using the third-party wget package:

import wget
wget.download("http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz", out="YOUR_PATH")
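Note that requests.get() on its own only fetches the response; it does not write the archive to disk. A minimal sketch of a complete download follows, assuming the server behaves as in the question. The helper names ensure_scheme and download, the default of http, and the output filename are hypothetical choices for illustration, not part of requests.

```python
def ensure_scheme(url, default_scheme="http"):
    """Prepend a scheme if the URL has none (hypothetical helper)."""
    if "://" in url:
        return url
    return f"{default_scheme}://{url}"


def download(url, dest, verify=True, chunk_size=8192):
    """Stream the URL to a local file and return the file path."""
    # Imported here so ensure_scheme stays usable without requests installed
    import requests

    url = ensure_scheme(url)
    # stream=True avoids loading the whole archive into memory at once
    with requests.get(url, stream=True, verify=verify, timeout=30) as r:
        r.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
        with open(dest, "wb") as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                f.write(chunk)
    return dest
```

It could be called as download("cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz", "IIIT5K-Word_V3.0.tar.gz"). If the server redirects to https and certificate verification fails, passing verify=False is the rough equivalent of wget's --no-check-certificate, at the cost of disabling TLS verification.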
