
Android webscrape - How to download a file from a site when the site uses a javascript download function

I am using Jsoup in my app to scrape data from a site. Everything was fine until I came to the 'download' part of the app. It would be easy if the download link were in the href value, but this site calls a JavaScript function instead.

Here's how the site is laid out:

This is the link to the file:

<a href="javascript:download(11848,'d915f46123');">Ai Ai Ai ni Utarete Bye Bye Bye</a>

Below is the JavaScript download function. It accepts a songId and a key, builds a path from those arguments, sets that path as the form's action attribute, and then calls the form's submit method:

function download(songId, key) {
    var form = document.getElementById('dlForm');
    form.action = '/download/zephzeph/' + key + '/' + songId + '.mp3';
    form.submit();
}

Below is the form:

<form id="dlForm" action="/amusic/download.php" method="POST"></form>

Hopefully I understand your question correctly; if not, please elaborate.

I would try right-clicking the download link and opening it in a new tab. The request behind that new link is what you need to emulate in your scraper.
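As a rough sketch of that emulation, the code below rebuilds the path that download(songId, key) assigns to the form's action, then sends the same empty POST the form would submit. It uses only the Java standard library. The class name DownloadLink, the method names, and the baseUrl parameter are hypothetical; the site's host name is not given in the question, and the server may also expect cookies or a Referer header that this sketch does not send.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names here are illustrative, not from the site or from Jsoup.
final class DownloadLink {
    // Matches href values such as: javascript:download(11848,'d915f46123');
    private static final Pattern HREF = Pattern.compile(
            "javascript:download\\(\\s*(\\d+)\\s*,\\s*'([^']+)'\\s*\\)");

    private DownloadLink() {}

    /** Rebuilds the form action that download(songId, key) would set. */
    static String actionPath(String href) {
        Matcher m = HREF.matcher(href);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a download() link: " + href);
        }
        String songId = m.group(1);
        String key = m.group(2);
        return "/download/zephzeph/" + key + "/" + songId + ".mp3";
    }

    /**
     * Sends an empty POST to the rebuilt action URL and saves the response body.
     * baseUrl is an assumption: pass the scheme and host the page was loaded from.
     */
    static void fetch(String baseUrl, String href, File target) throws IOException {
        URL url = URI.create(baseUrl).resolve(actionPath(href)).toURL();
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST"); // the form declares method="POST"
        conn.setDoOutput(true);
        try {
            conn.getOutputStream().close(); // the form has no fields, so the body is empty
            try (InputStream in = conn.getInputStream();
                 OutputStream out = new FileOutputStream(target)) {
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

With Jsoup, the href string can be read from the anchor element with attr("href") and passed to actionPath. On Android, fetch has to run off the main thread, since network access on the main thread is not allowed.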

My experience with web scraping is very limited, but I would be more than happy to help you find a solution. :)

It looks like the only way I can achieve this is to use a WebView ( http://android-er.blogspot.com/2011/10/call-javascript-inside-webview-from.html?m=1 ). I'll try it and see if it works.


 