Python 2/3: Get name and extension of file at URL
When I download this file: https://drive.google.com/uc?export=download&id=0B4IfiNtPKeSATWZXWjEyd1FsRG8
Chrome knows that it is named testzip2.zip and downloads it to the download folder with this name.
How can I get this name in Python (in a way that works in both Python 2.7 and 3.X)?
My previous approach:
response = urlopen(url)
header = response.headers['content-disposition']
original_file_name = next(x for x in header.split(';') if x.startswith('filename')).split('=')[-1].lstrip('"\'').rstrip('"\'')
This does not seem to work reliably: it occasionally and randomly fails with KeyError: 'content-disposition' or AttributeError: 'NoneType' object has no attribute 'split'.
You can use
import re
...
content_disposition = response.headers.get('Content-Disposition')
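# The header may be absent, so guard against None; quotes around the name are optional
match = re.search(r'filename="?([^";]+)"?', content_disposition or '')
filename = match.group(1) if match else None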
However, in Python 3 there is a handy method on the HTTPMessage object to get the filename.
filename = response.headers.get_filename() # python3
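Since the question asks for something that runs on both Python 2.7 and 3.X and survives a missing header, the two ideas above could be combined roughly as follows. This is only a sketch under assumptions: the helper name `guess_filename` is hypothetical, the regex handles the plain `filename=` parameter only (not the RFC 5987 `filename*=` form), and falling back to the last URL path segment is a guess that will not recover the real name for links like the Google Drive one in the question.

```python
import os
import re

try:                                    # Python 3
    from urllib.parse import urlparse, unquote
except ImportError:                     # Python 2.7
    from urlparse import urlparse
    from urllib import unquote


def guess_filename(content_disposition, url):
    """Hypothetical helper: return a filename from a Content-Disposition
    value if possible, else from the URL path, else None."""
    if content_disposition:
        # Matches filename="name.ext" or filename=name.ext (quotes optional).
        match = re.search(r'filename="?([^";]+)"?', content_disposition)
        if match:
            return match.group(1).strip()
    # Fallback (an assumption, not server-provided): last segment of the URL path.
    name = unquote(urlparse(url).path.rsplit('/', 1)[-1])
    return name or None


def split_name(filename):
    """Split 'testzip2.zip' into ('testzip2', '.zip')."""
    return os.path.splitext(filename)
```

With a response obtained from `urlopen(url)`, this might be called as `guess_filename(response.headers.get('Content-Disposition'), url)`. Using `.get()` instead of indexing avoids the KeyError when the server omits the header, and the caller should still check for a `None` result.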