Python 2/3: Get name and extension of file at URL

When I download this file: https://drive.google.com/uc?export=download&id=0B4IfiNtPKeSATWZXWjEyd1FsRG8

Chrome knows that it is named testzip2.zip and downloads it to the download folder under this name.

How can I get this name in Python, in a way that works in both Python 2.7 and 3.x?

My previous approach:

response = urlopen(url)
header = response.headers['content-disposition']
original_file_name = next(x for x in header.split(';') if x.startswith('filename')).split('=')[-1].lstrip('"\'').rstrip('"\'')

This does not seem to work reliably: it occasionally and randomly fails with KeyError: 'content-disposition' or AttributeError: 'NoneType' object has no attribute 'split'.

You can extract the filename from the Content-Disposition header with a regular expression:

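import re
...

# headers.get() returns None when the server sends no Content-Disposition header
content_disposition = response.headers.get('Content-Disposition') or ''
# Matches only a double-quoted filename made of word characters and dots
match = re.search(r'filename="([\w.]+)"', content_disposition)
filename = match.group(1) if match else None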
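The pattern above only matches a double-quoted name made of word characters and dots, so a name such as my-file.tar.gz or an unquoted value would be missed. A more permissive variant might look like the sketch below. The helper name filename_from_header and the exact pattern are assumptions for illustration, not part of the original answer, and the pattern deliberately ignores the RFC 2231 filename*= form.

```python
import re

# Accepts filename="name", filename='name' or an unquoted filename=name.
# Hypothetical pattern for illustration; it does not handle filename*=.
_FILENAME_RE = re.compile(
    r'filename\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^;\s]+))',
    re.IGNORECASE,
)


def filename_from_header(content_disposition):
    """Return the filename parameter, or None if it cannot be found."""
    if not content_disposition:
        # Header missing (None) or empty
        return None
    match = _FILENAME_RE.search(content_disposition)
    if match is None:
        return None
    # Exactly one of the three alternatives captured a value
    return next(group for group in match.groups() if group is not None)
```

Called as filename_from_header(response.headers.get('Content-Disposition')), this returns None instead of raising when the server omits the header, which is the failure described in the question.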

However, in Python 3 the HTTPMessage object has a handy method to get the filename:

filename = response.headers.get_filename()  # python3
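Since the question asks for something that runs on both 2.7 and 3.x, one option is to feed the raw header value to the standard library's email.message.Message, which provides get_filename() in both versions. This is a minimal sketch under that assumption. The helper name filename_from_content_disposition is made up for illustration, and the caller still has to decide what to do when the server sends no header at all.

```python
from email.message import Message


def filename_from_content_disposition(header_value):
    """Parse a Content-Disposition value and return its filename, or None.

    header_value is the raw header string, for example the result of
    response.headers.get('Content-Disposition'); it may be None.
    """
    if not header_value:
        # Server sent no Content-Disposition header
        return None
    message = Message()
    message['Content-Disposition'] = header_value
    # get_filename() returns None when there is no filename parameter
    return message.get_filename()
```

For example, filename_from_content_disposition('attachment; filename="testzip2.zip"') gives 'testzip2.zip', and passing None gives None rather than raising.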
