
Python: Get download link from JavaScript button

I am trying to get my script to download subtitles from www.subscene.com. The problem is that the download button on the web page is implemented in JavaScript, and for some reason I cannot download the subtitles even after I extract the URL.

I think this is the code for the download button:

<a id="s_lc_bcr_downloadLink" class="downloadLink rating0" href="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions(&quot;s$lc$bcr$downloadLink&quot;, &quot;&quot;, true, &quot;&quot;, &quot;/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407-dlpath-90698/zip.zipx&quot;, false, true))">Download English Subtitle</a><a id="s_lc_bcr_previewLink" href="javascript:togglePreview(482407, 'zip');">(See preview)</a>

So I extract the URL and tell my script to download it:

urllib.urlretrieve('http://subscene.com/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407-dlpath-90698/zip.zipx','c:\\sub.zip')

(I added 'http://subscene.com' in front of the path.)
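For reference, the extraction step could look something like the sketch below. It pulls the path out of the link's href with a regular expression. The function name extract_download_path is made up for illustration, and the code assumes the href has the same shape as the one shown above.

```python
import re


def extract_download_path(href):
    """Return the first quoted argument that looks like a path, or None.

    Hypothetical helper: assumes the href wraps its arguments in
    WebForm_PostBackOptions(...) as in the anchor tag shown above.
    """
    # The raw HTML attribute uses &quot; entities; turn them into real quotes.
    text = href.replace('&quot;', '"')
    match = re.search(r'WebForm_PostBackOptions\((.*?)\)', text)
    if match is None:
        return None
    # Collect every double-quoted argument, in order of appearance.
    quoted_args = re.findall(r'"([^"]*)"', match.group(1))
    for arg in quoted_args:
        if arg.startswith('/'):
            return arg
    return None
```

As far as I know, in ASP.NET Web Forms this argument is the URL that the page's form gets posted to, not a plain download link, which may be why fetching it directly with urllib.urlretrieve returns something other than the subtitle file.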

But for some reason it doesn't download the right file. What am I supposed to do?

EDIT:

Thanks a lot! Unfortunately I can't get it to work :( It says the following:

from selenium import webdriver

browser = webdriver.Firefox()
browser.execute_script('WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("s$lc$bcr$downloadLink", "", true, "", "/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407-dlpath-90698/zip.zipx", false, true))')

Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    browser.execute_script('WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("s$lc$bcr$downloadLink", "", true, "", "/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407-dlpath-90698/zip.zipx", false, true))')
  File "C:\Users\User\AppData\Roaming\Python\Python27\site-packages\selenium\webdriver\remote\webdriver.py", line 385, in execute_script
    {'script': script, 'args':converted_args})['value']
  File "C:\Users\User\AppData\Roaming\Python\Python27\site-packages\selenium\webdriver\remote\webdriver.py", line 153, in execute
    self.error_handler.check_response(response)
  File "C:\Users\User\AppData\Roaming\Python\Python27\site-packages\selenium\webdriver\remote\errorhandler.py", line 126, in check_response
    raise exception_class(message, screen, stacktrace)
WebDriverException: Message: ''

As John said, this is not the file but JavaScript code. So instead of fetching that file with urllib.urlretrieve, you can execute the JavaScript, which in turn downloads the file. This can be done using the selenium module:

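from selenium import webdriver
# Start a Firefox session controlled by selenium.
browser = webdriver.Firefox()
# Load the subtitle page first so that its postback functions are defined.
browser.get('http://subscene.com/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407.aspx')
# Run the same JavaScript that the download link's href would run.
browser.execute_script('WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("s$lc$bcr$downloadLink", "", true, "", "/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407-dlpath-90698/zip.zipx", false, true))')
# Keep the script (and the browser) alive until Enter is pressed (Python 2).
raw_input()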

I got this JavaScript snippet using Firebug.
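If you would rather not copy the snippet out of Firebug by hand for every subtitle, the string passed to execute_script could be assembled in Python. This is only a sketch: build_postback_script is a hypothetical helper, and it assumes that every download link uses the same argument layout as the one above (with only the event target and the path changing).

```python
import json


def build_postback_script(event_target, action_url):
    """Build the JavaScript call that the download link's href would run.

    Hypothetical helper; the fixed arguments (true, "", false, true) are
    copied from the example link and are assumed to be the same elsewhere.
    """
    # json.dumps yields a double-quoted, properly escaped JavaScript string.
    return (
        'WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions('
        '%s, "", true, "", %s, false, true))'
        % (json.dumps(event_target), json.dumps(action_url))
    )
```

The result can then be passed to browser.execute_script after browser.get has loaded the subtitle page.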
