
PHP - parsing src attribute of <img> tag in string

What would be the simplest but reliable way to parse the src attribute of the first <img> tag found in an arbitrary text string, without using any external libraries? That means getting everything between the opening and closing " characters of the <img> tag's src attribute.


I wrote this script, but it is not reliable in some cases:

  $string = $item['description'];
  $arr = explode('img', $string);
  $arr = explode('src', $arr[1]);
  $arr = explode('=', $arr[1]);
  $arr = explode('>', $arr[1]);

  $pos1 = strpos($arr[0], '"')+1;
  $pos2 = strrpos($arr[0], '"')-1;

  if (!$pos1) {
    $pos1 = strpos($arr[0], "'")+1;
    $pos2 = strrpos($arr[0], "'")-1;
  }

  if ($pos1 && $pos2) { 
    $result = substr($arr[0], $pos1, $pos2); 
  }
  else { $result = null; }

If you want to get the values of all attributes of an img tag, you need two regular expressions.

1. Get the content of an img tag:

/<\s*img([^<>]+)>/
2. Then apply this regex to the captured content with preg_match_all():

/\S+\s*=\s*['"]([^"']+)['"]/
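The two steps above can be sketched roughly as follows. This is only an illustration, not the answerer's code: the function name `firstImgAttributes` is hypothetical, the `i` flag is added so that `<IMG>` also matches, and a capture group around the attribute name is added so the result can be keyed by name.

```php
<?php
// Hypothetical helper (not from the answer): returns name => value pairs
// for the first <img> tag in $html, or an empty array if none is found.
function firstImgAttributes(string $html): array
{
    // Step 1: capture everything between "<img" and ">".
    if (!preg_match('/<\s*img([^<>]+)>/i', $html, $tag)) {
        return [];
    }

    // Step 2: capture each name="value" or name='value' pair.
    // The group around \S+ is an addition so the name is kept as well.
    preg_match_all(
        '/(\S+)\s*=\s*[\'"]([^"\']+)[\'"]/',
        $tag[1],
        $pairs,
        PREG_SET_ORDER
    );

    $attrs = [];
    foreach ($pairs as $pair) {
        $attrs[strtolower($pair[1])] = $pair[2];
    }
    return $attrs;
}

$attrs = firstImgAttributes('<p>Hi <img src="a.png" alt=\'logo\'></p>');
echo $attrs['src'] ?? 'no src';
```

Keying the result by attribute name means the src value can be read directly, whatever position it has inside the tag. Unquoted values and values containing a quote character are still missed by this pattern.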

Here is your answer: first, apply this regex:

<img(.*?)>

Then, to get the other attributes, apply a second regex to the previous result:

"(.*?)"

Try this:

<img\s+src\s?\=\s?\"(https?\:\/\/[\w\.\/]+)\".*\/>
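A quick way to see what this pattern accepts is to run it through preg_match(). The sketch below is an assumption-laden illustration: the function name `srcViaStrictPattern` and the sample inputs are hypothetical, and the pattern is written with `#` delimiters so the slashes need no escaping (it is otherwise the same expression).

```php
<?php
// The pattern from the answer, rewritten with # delimiters.
$pattern = '#<img\s+src\s?=\s?"(https?://[\w./]+)".*/>#';

// Hypothetical helper: returns the captured URL, or null when nothing matches.
function srcViaStrictPattern(string $html, string $pattern): ?string
{
    return preg_match($pattern, $html, $m) ? $m[1] : null;
}

// Matches: src comes first, the URL is absolute, the tag is self-closed.
$a = srcViaStrictPattern('<img src="http://example.com/a.png" />', $pattern);

// No match: another attribute precedes src and the tag is not closed with "/>".
$b = srcViaStrictPattern('<img alt="x" src="http://example.com/a.png">', $pattern);

var_dump($a, $b);
```

The pattern only matches when src is the first attribute, the value is a double-quoted http or https URL made of word characters, dots and slashes, and the tag ends with `/>`, so it fits tightly controlled markup rather than arbitrary text.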

The safest way is to use the built-in DOMDocument class (available since PHP 5). Use getElementsByTagName(), check that the length is greater than 0, and read the src value of the first item with getAttribute('src'):

$html = "YOUR_HTML_STRING";
$dom = new DOMDocument('1.0', 'UTF-8');
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$imgs = $dom->getElementsByTagName('img');
if ($imgs->length > 0) {
    echo $imgs->item(0)->getAttribute('src');
}
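For reuse, the DOMDocument approach can be wrapped in a small function. This is a minimal sketch under some assumptions: the name `firstImgSrc` is hypothetical, and the libxml error handling is an addition meant to keep warnings about sloppy markup out of the output.

```php
<?php
// Hypothetical wrapper around the DOMDocument approach shown above.
// Returns the src of the first <img>, or null if there is none.
function firstImgSrc(string $html): ?string
{
    if (trim($html) === '') {
        return null; // loadHTML() does not accept an empty string
    }

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument('1.0', 'UTF-8');
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $imgs = $dom->getElementsByTagName('img');
    if ($imgs->length === 0) {
        return null;
    }

    $img = $imgs->item(0);
    return $img->hasAttribute('src') ? $img->getAttribute('src') : null;
}

echo firstImgSrc("<p>text <img alt='x' src='a.png'> more <img src=\"b.png\"></p>");
```

Because the parser handles quoting and attribute order, single-quoted values and a src that is not the first attribute are both covered, which the regex variants do not guarantee.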

See this PHP demo.
