How to scrape URLs from inline JavaScript

Question

The script below is repeated 240 times on the page; each time, the last two sets of digits in the URL are different numbers. I would like a list of all the URLs.

So I suppose I need to find each script and then find the first "commtArr" in each script, assuming it's always the first.

Where do I even start?

<script type="text/javascript">
            commArr[commArr.length] = "http://example.com/index.php?option==down&pid=123&id=389";
            commtArr[commtArr.length] = "mp3";
            commnArr[commnArr.length] = "john doe.mp3";
</script>

Answer 1

The URL is actually being inserted into commArr, not commtArr. It seems commArr will only ever hold URLs.

Assuming the script is repeated X times on the same page, you're left with a single variable that already holds all the URLs. It's just a simple case of listing them out.

for (var i = 0; i < commArr.length; i++) { console.log(commArr[i]); }
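If you only have the page source as text (for example a saved copy) rather than a live console where commArr is defined, the same list can probably be recovered by pattern-matching the assignments. This is a minimal sketch, assuming every URL is assigned in exactly the form shown in the question; `extractCommArrUrls` is a hypothetical helper name, not something the page provides.

```javascript
// Hypothetical helper: pull every URL assigned to commArr out of raw HTML text.
// Assumes each assignment looks like: commArr[commArr.length] = "URL";
function extractCommArrUrls(html) {
  // commtArr and commnArr do not match, because of the extra letter in their names.
  var pattern = /\bcommArr\[commArr\.length\]\s*=\s*"([^"]*)"/g;
  var urls = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    urls.push(match[1]); // capture group 1 is the text between the quotes
  }
  return urls;
}

// Sample input copied from the question.
var sample =
  '<script type="text/javascript">\n' +
  '  commArr[commArr.length] = "http://example.com/index.php?option==down&pid=123&id=389";\n' +
  '  commtArr[commtArr.length] = "mp3";\n' +
  '  commnArr[commnArr.length] = "john doe.mp3";\n' +
  '</script>';

console.log(extractCommArrUrls(sample));
```

In a browser console, passing `document.documentElement.innerHTML` instead of `sample` should work the same way, as long as the scripts are part of the static page.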

If it's spread over various pages, then you may need some kind of spider bot script to go to all the pages, run a script that grabs commArr, and persistently save it. I'm afraid I can't suggest anything for that aside from doing it manually.
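For the multi-page case, one possible starting point is a small Node.js script that downloads each page and collects the matches. This is only a sketch under several assumptions: Node 18 or later (for the built-in `fetch`), URLs that sit in the static HTML rather than being generated at runtime, and a list of page addresses you already know. The names `extractUrls`, `collectFromPages`, `downloadText`, and the example addresses are all hypothetical.

```javascript
// Extract URLs assigned as: commArr[commArr.length] = "URL";
function extractUrls(html) {
  var pattern = /\bcommArr\[commArr\.length\]\s*=\s*"([^"]*)"/g;
  var urls = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// Visit each page in turn and gather all URLs into one array.
// fetchText is passed in so the function can be tried without network access.
async function collectFromPages(pageUrls, fetchText) {
  var all = [];
  for (var i = 0; i < pageUrls.length; i++) {
    var html = await fetchText(pageUrls[i]);
    all = all.concat(extractUrls(html));
  }
  return all;
}

// Default downloader using the global fetch available in Node 18+.
async function downloadText(url) {
  var response = await fetch(url);
  if (!response.ok) {
    throw new Error("Request failed: " + response.status + " " + url);
  }
  return response.text();
}

// Example with placeholder addresses (replace with the real pages):
// collectFromPages(["http://example.com/page1", "http://example.com/page2"], downloadText)
//   .then(function (urls) { console.log(urls.join("\n")); });
```

Printing one URL per line means the output can be redirected to a file to keep it, for example `node scrape.js > urls.txt`, where `scrape.js` is whatever you name the script.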

Answer 1
0 2015-05-20 06:45:32
