
PHP: How to find and extract an element with src attribute in html (from url)

I am currently using PHP's cURL functions to fetch content from a URL. After getting the contents, I need to inspect the returned HTML, find a 'video' element that has a given style attribute, and extract the src value of its source child. Currently I can fetch the page, but how can I get this value? Here is my code to get the page:

<?php
$Url = 'some site';

if (!function_exists('curl_init')){
    die('CURL is not installed!');
}
$ch = curl_init($Url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // add this one, it seems to spawn redirect 301 header
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13'); // spoof
$output = curl_exec($ch);
curl_close($ch);

echo $output;

The code above works and outputs the page. When I inspected the elements in the page's output, I found this:

<div class="webstarvideo">
  <video style="width:100%;height:100%" preload="none" class="">
    <source src="I NEED THIS" type="video/mp4"></video>
  <div class="webstarvideodoul">
    <canvas></canvas>
  </div>
</div>

I need the src of the video in the above code. How can I do that?

At the PHP level:

You can use a regex with preg_match or use the PHP DOMDocument class:

With DOM

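$doc = new DOMDocument();
libxml_use_internal_errors(true); // suppress warnings about HTML5 tags such as <video>
$doc->loadHTML($output);
$videoSource = $doc->getElementsByTagName('source')->item(0); // first <source> element in the node list

echo $videoSource->getAttribute('src');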
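The question asks for a video with a particular style attribute, and one way to narrow the DOM lookup to that element is an XPath query. The following is a minimal sketch, not a definitive implementation: `findVideoSrcByStyle` is a hypothetical helper name, the markup is the sample from the question, and it assumes the style value must match exactly.

```php
<?php
// Sample markup copied from the question; in practice this would be the cURL $output.
$html = '<div class="webstarvideo">
  <video style="width:100%;height:100%" preload="none" class="">
    <source src="I NEED THIS" type="video/mp4"></video>
  <div class="webstarvideodoul">
    <canvas></canvas>
  </div>
</div>';

/**
 * Hypothetical helper: returns the src of the first <source> inside a <video>
 * whose style attribute equals $style, or null when nothing matches.
 */
function findVideoSrcByStyle(string $html, string $style): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $previous = libxml_use_internal_errors(true); // HTML5 tags trigger libxml warnings
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // The style value is embedded in double quotes, so it must not contain one itself.
    $nodes = $xpath->query('//video[@style="' . $style . '"]/source[@src]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return $nodes->item(0)->getAttribute('src');
}

$videoSrc = findVideoSrcByStyle($html, 'width:100%;height:100%');
echo $videoSrc, PHP_EOL; // prints: I NEED THIS
```

Returning null instead of echoing directly lets the caller decide what to do when the page layout changes and no matching video is found.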

With REGEX

$array = array();
preg_match("/source src=\"([^\"]*)\" type=\"video\/mp4\">/i", $output, $array);
echo $array[1];
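Reading `$array[1]` directly raises a warning when the pattern does not match, so it may be worth checking the return value of preg_match first. A small sketch under that assumption, using one line of the question's markup as sample input; the variable names `$matches` and `$videoSrc` are illustrative only.

```php
<?php
// Sample line from the question; in practice this would be the cURL $output.
$output = '<source src="I NEED THIS" type="video/mp4"></video>';

// preg_match() returns 1 on a match, 0 on no match, and false on error,
// so test the result before reading the capture group.
if (preg_match('/<source\s+src="([^"]*)"\s+type="video\/mp4"/i', $output, $matches) === 1) {
    $videoSrc = $matches[1];
} else {
    $videoSrc = null; // no matching <source> tag in the markup
}

var_dump($videoSrc);
```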

If you want to get the video's SRC as a PHP variable, you need to extract it from the string by checking where "type" is:

$output = '<div class="webstarvideo">
  <video style="width:100%;height:100%" preload="none" class="">
    <source src="I NEED THIS" type="video/mp4"></video>
  <div class="webstarvideodoul">
    <canvas></canvas>
  </div>
</div>';

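$src_marker = 'src="';
$start = strpos($output, $src_marker) + strlen($src_marker); // offset of the first character of the src value
$type_position = strpos($output, '" type=', $start); // closing double quote just before type
$video_src = substr($output, $start, $type_position - $start);
echo $video_src; // I NEED THIS

In the above example, $start is the offset of the first character after the opening double quote of the SRC attribute, and $type_position is the offset of the closing double quote that precedes type, so the difference between them is the length of the value. Looking both positions up with strpos, rather than hard-coding character counts, keeps the extraction correct when the whitespace before the tag changes.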

Hope this helps! :)

With PHP, you can use Simple HTML DOM Parser to do this; its query syntax is similar to jQuery's.

$Url = 'some site';

if (!function_exists('curl_init')){
    die('CURL is not installed!');
}
$ch = curl_init($Url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // add this one, it seems to spawn redirect 301 header
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13'); // spoof
$output = curl_exec($ch);
curl_close($ch);

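// str_get_html() comes from the Simple HTML DOM library, which must be included first
$html = str_get_html($output);

$source = $html->find('video source', 0); // first <source> inside a <video>; the src lives on <source>, not on <video>
$videoSrc = $source->src;
var_dump($videoSrc);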

Assuming that $output is the complete text, you can use a regex:

preg_match_all("/(?<=\<source).*?src=\"([^\"]+)\"/", $output, $all);

print_r($all[1]); // all the links will be in this array
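To illustrate what ends up in `$all[1]` when a page contains more than one source, here is a short sketch using the same pattern (written in single quotes, so the inner double quotes need no escaping). The markup and file names are made up for the example.

```php
<?php
// Hypothetical markup with two sources; the file names are invented for illustration.
$output = '<video><source src="a.mp4" type="video/mp4"><source src="b.webm" type="video/webm"></video>';

// preg_match_all() returns the number of full matches, or false on error.
$count = preg_match_all('/(?<=<source).*?src="([^"]+)"/', $output, $all);

// Fall back to an empty list when nothing matched or the call failed.
$sources = $count ? $all[1] : [];
print_r($sources);
```

Each entry of `$sources` is one captured src value, in the order the tags appear in the page.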

Use document.querySelector() to select the element, then read the src attribute with the element's getAttribute() method.

var video = document.querySelector('.webstarvideo video source');
console.log(video.getAttribute('src'));

<div class="webstarvideo">
  <video style="width:100%;height:100%" preload="none" class="">
    <source src="I NEED THIS" type="video/mp4"></video>
  <div class="webstarvideodoul">
    <canvas></canvas>
  </div>
</div>
