Get filename of downloadable binary from PHP URL without actually downloading the file

I'm doing some web scraping with Selenium in Python, and I have a link on a page that points to, e.g.:

<a href="/zip.php?zipid=103">Click Here To Download</a>

Now, of course, if I click on it, my browser will immediately start downloading the file, e.g. myinterestingarchive.zip.

What I'm wondering is whether I can inject some JavaScript, say, that will tell me the filename myinterestingarchive.zip WITHOUT my clicking the link. I'd like to record the filename in my program's log, and it appears nowhere in the page source or outerHTML, only that PHP URL.

If the server supports HEAD requests, which return only the HTTP headers and not the body, you can do:

import requests

......
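# Reuse the Selenium session's cookies so the request is authenticated the same way
cookies = {c['name']: c['value'] for c in driver.get_cookies()}
# HEAD fetches only the response headers, not the file body
response = requests.head('http://....../zip.php?zipid=103', cookies=cookies)
# .get() returns None instead of raising KeyError if the header is absent
print(response.headers.get('Content-Disposition'))
# attachment; filename=zip/myinterestingarchive.zip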
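
The header value printed above still includes the `attachment;` prefix and a `zip/` directory part. As a rough sketch, the bare filename could be pulled out with the standard library's header parser. The helper name `filename_from_content_disposition` is made up for this example, and dropping the directory part is an assumption about what you want in the log.

```python
import posixpath
from email.message import Message


def filename_from_content_disposition(header_value):
    """Return the bare filename from a Content-Disposition value, or None."""
    # Let the stdlib parse the "; filename=..." parameter, including quoted values.
    msg = Message()
    msg["Content-Disposition"] = header_value
    name = msg.get_filename()
    if not name:
        return None
    # Assumption: only the last path component is wanted (drops "zip/").
    return posixpath.basename(name)


if __name__ == "__main__":
    header = "attachment; filename=zip/myinterestingarchive.zip"
    print(filename_from_content_disposition(header))
```

With the requests call above, you would pass `response.headers.get('Content-Disposition')` to the helper, after checking that it is not `None`.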

And yes, you can do this with injected JavaScript, but it is simpler with requests.
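
For completeness, here is a minimal sketch of the injected-JavaScript route, driven from Python. It assumes the page and the download URL share an origin (so the browser exposes the header to the script) and that the server answers HEAD requests. The names `HEAD_SCRIPT` and `content_disposition_via_browser` are invented for this example; `execute_async_script` is Selenium's own method, which passes a completion callback as the last script argument.

```python
# JavaScript run inside the page: issue a HEAD request with the page's own
# cookies and hand the Content-Disposition header back to Selenium.
HEAD_SCRIPT = """
const url = arguments[0];
const done = arguments[arguments.length - 1];  // callback supplied by Selenium
fetch(url, {method: 'HEAD', credentials: 'same-origin'})
  .then(resp => done(resp.headers.get('Content-Disposition')))
  .catch(() => done(null));
"""


def content_disposition_via_browser(driver, url):
    """Return the Content-Disposition header for url, or None on failure."""
    # url may be relative (e.g. the link's href); fetch resolves it
    # against the page currently loaded in the browser.
    return driver.execute_async_script(HEAD_SCRIPT, url)
```

A call would look like `content_disposition_via_browser(driver, '/zip.php?zipid=103')`. If the server is slow, the script timeout may need raising first with `driver.set_script_timeout(...)`.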
