
DOM HTML & JS scrape from JS part

I want to scrape links from one page into another using an HTML DOM parser.

The other webpage contains this code:

$('#vidabc-fast-watch-button').click(function() {
  $('#fast-watch-frame').attr('src','http://vidabc.com/embed-8fyiakzp0ob8.html');
});
$('#kingvid-fast-watch-button').click(function() {
});
$('#vidwatch-fast-watch-button').click(function() {
  $('#fast-watch-frame').attr('src','');
});
$('#estream-fast-watch-button').click(function() {
  $('#fast-watch-frame').attr('src','http://estream.to/embed-2605th4kkypl.html');
});
$('#openload-fast-watch-button').click(function() {
  $('#fast-watch-frame').attr('src','http://openload.co/embed/YsaOx8K5Bk0/');
});

I want to pull this information into another PHP page and preg_match the URLs, but the DOM parser can't find the links because they are inside the JS code.

Any ideas?

You can match the URLs inside the script by reading the text content of the script tag and running preg_match_all on it:

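// $doc is a DOMDocument already loaded with the page; take the first <script> element's source
$scr = $doc->getElementsByTagName('script')[0]->textContent;
// Match every http: URL made up of characters that are valid in a URL
preg_match_all('/http:[\w#\[\]@!$&()*+,;=%:\/.?~-]*/', $scr, $urls);

// $urls[0] holds the full matches
print_r($urls[0]);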

For the given example this would output:

Array
(
    [0] => http://vidabc.com/embed-8fyiakzp0ob8.html
    [1] => http://estream.to/embed-2605th4kkypl.html
    [2] => http://openload.co/embed/YsaOx8K5Bk0/
)

See it run on eval.in
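
If the page has more than one script tag, or mixes http and https links, the same idea can be wrapped in a small function. The following is only a sketch under a few assumptions: extractScriptUrls and the sample markup are made up for illustration, the page is loaded with DOMDocument::loadHTML, and the pattern is widened to accept https as well as http.

```php
<?php
// Hypothetical helper (not part of the original answer): returns every
// http/https URL found inside any inline <script> element of $html.
function extractScriptUrls(string $html): array
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('script') as $script) {
        // textContent is the raw JavaScript source of an inline script.
        $pattern = '/https?:\/\/[\w#\[\]@!$&()*+,;=%:\/.?~-]+/';
        if (preg_match_all($pattern, $script->textContent, $m)) {
            $urls = array_merge($urls, $m[0]);
        }
    }
    // Drop duplicates and reindex from 0.
    return array_values(array_unique($urls));
}

// Sample markup modelled on the question; the surrounding HTML is assumed.
$html = <<<'PAGE'
<html><body>
<iframe id="fast-watch-frame"></iframe>
<script>
$('#vidabc-fast-watch-button').click(function() {
  $('#fast-watch-frame').attr('src','http://vidabc.com/embed-8fyiakzp0ob8.html');
});
$('#vidwatch-fast-watch-button').click(function() {
  $('#fast-watch-frame').attr('src','');
});
</script>
<script>
$('#openload-fast-watch-button').click(function() {
  $('#fast-watch-frame').attr('src','http://openload.co/embed/YsaOx8K5Bk0/');
});
</script>
</body></html>
PAGE;

$urls = extractScriptUrls($html);
print_r($urls);
```

For the sample markup this should list the vidabc and openload URLs and skip the handler whose src is empty. Note that this only sees inline scripts; URLs that live in an external .js file would need that file to be fetched and scanned separately.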
