
How do I go about scraping a URL from inline JavaScript?

This is repeated 240 times; each time, the last two sets of digits are different numbers. I would like a list of all the URLs.

So I suppose I need to find each script and then find the first "commtArr" in each script, assuming it's always the first.

Where do I even start?

<script type="text/javascript">
            commArr[commArr.length] = "http://example.com/index.php?option==down&pid=123&id=389";
            commtArr[commtArr.length] = "mp3";
            commnArr[commnArr.length] = "john doe.mp3";
</script>

The URL is actually being inserted into commArr, not commtArr. It seems commArr will only ever hold the URL.

Assuming the script is repeated X times on the same page, you already end up with a single variable holding all the URLs. It's just a simple case of listing them out.

for (let i = 0; i < commArr.length; i++) { console.log(commArr[i]); }
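The loop above only works where the page's scripts have already run. If all you have is the page source as text, a regular expression is one way to pull the URLs out. The following is a minimal sketch, not a definitive implementation: the function name `extractCommArrUrls` is hypothetical, and it assumes every assignment is written exactly as in the question, with the URL in a double-quoted string literal.

```javascript
// Sketch: read the URLs from the raw page source without executing it.
// Assumes each assignment looks like: commArr[commArr.length] = "...";
function extractCommArrUrls(html) {
  // The literal name "commArr" does not occur inside "commtArr" or
  // "commnArr", so only the URL assignments match.
  const pattern = /\bcommArr\[commArr\.length\]\s*=\s*"([^"]*)"/g;
  const urls = [];
  for (const match of html.matchAll(pattern)) {
    urls.push(match[1]); // capture group 1 is the text between the quotes
  }
  return urls;
}

// Sample input copied from the question.
const sample = `
<script type="text/javascript">
    commArr[commArr.length] = "http://example.com/index.php?option==down&pid=123&id=389";
    commtArr[commtArr.length] = "mp3";
    commnArr[commnArr.length] = "john doe.mp3";
</script>`;

console.log(extractCommArrUrls(sample));
```

A regex is brittle if the site changes how it writes these lines (single quotes, escaped characters), so treat the pattern as something to adjust against the real page.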

If it's spread across various pages, then you may need some kind of spider bot script to go to all the pages, run a script that grabs commArr, and save it persistently. I'm afraid I can't suggest anything for that aside from doing it manually.
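One possible starting point for the multi-page case is sketched below. It is only an illustration under stated assumptions: `collectCommArrUrls`, `defaultFetchText`, and the `fakePages` data are hypothetical names, the list of page addresses has to be supplied by you, and the default fetcher relies on a global `fetch` (available in browsers and in Node 18 or later). The pattern assumes the assignments look exactly like the one in the question.

```javascript
// Sketch: visit several pages and collect every commArr URL in their source.
// fetchText is a parameter so the crawl can be tried without network access.
async function collectCommArrUrls(pageUrls, fetchText = defaultFetchText) {
  const pattern = /\bcommArr\[commArr\.length\]\s*=\s*"([^"]*)"/g;
  const found = new Set(); // a Set drops URLs that repeat across pages
  for (const pageUrl of pageUrls) {
    const html = await fetchText(pageUrl);
    for (const match of html.matchAll(pattern)) {
      found.add(match[1]);
    }
  }
  return [...found];
}

// Default fetcher: downloads a page as text using the global fetch.
async function defaultFetchText(url) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request for ${url} failed with status ${response.status}`);
  }
  return response.text();
}

// Made-up page sources standing in for real downloads.
const fakePages = {
  page1: 'commArr[commArr.length] = "http://example.com/a";',
  page2: 'commArr[commArr.length] = "http://example.com/b";\n' +
         'commtArr[commtArr.length] = "mp3";',
};

collectCommArrUrls(Object.keys(fakePages), async (key) => fakePages[key])
  .then((urls) => console.log(urls));
```

To keep the result, the returned array could be serialized with `JSON.stringify` and written wherever suits your environment, for example a file in Node or `localStorage` in a browser.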

