
How to get the src attr in this img tag

I'm using PHP Simple HTML DOM Parser and everything runs fine until I get to this div content. I've tried every way I can think of to get the src attribute (finding the a tags, finding the img), and all of them fail. I can get the img tag, but I can only read the width, height, and alt attributes (and for alt, only the part where "some text" appears, not the other strings).

<img width="656" height="370" 
alt="some text " .="" othertetx="" anothertext="" anothertext="" anothertext="" anothertext'="" title="same text in the alt attr " src="http://siteurl/getattach/somedir/somefile.aspx">

I think the problem is the alt attribute: all the text with the .= symbols seems to confuse the parser. The tag displays fine in browsers, so it must be "standard".

Edit:

The linked answer does not resolve the problem. I know how to get the src in general; the problem is with this particular tag. Please take the time to read the full question before marking it as a duplicate. The code provided in the suggested answer does not work with the sample I show.

This

$img_src = $element->src;
if(!strstr($img_src, 'http://')) {
    $img_src = $v . $img_src;
}

does not extract the src attribute from this:

<img width="656" height="370" 
    alt="some text " .="" othertetx="" anothertext="" anothertext="" anothertext="" anothertext'="" title="same text in the alt attr " src="http://siteurl/getattach/somedir/somefile.aspx">

The <img> element is not valid HTML: several of its attribute declarations are malformed. I suggest using a validation service such as the W3C online validator to see those errors. I've wrapped the img tag from your question into this document for validation.

However, although the <img> tag isn't valid, the DOMDocument class is able to parse it, like this:

$string = <<<EOF
<img width="656" height="370"
alt="some text " .="" othertetx="" anothertext="" anothertext="" anothertext="" anothertext'="" title="same text in the alt attr " src="http://siteurl/getattach/somedir/somefile.aspx">
EOF;

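$doc = new DOMDocument();
// The @ suppresses the warnings raised for the malformed attributes.
@$doc->loadHTML($string);

// Read the src attribute of the first <img> element in the document.
$images = $doc->getElementsByTagName('img');
echo $images->item(0)->getAttribute('src');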

Output:

http://siteurl/getattach/somedir/somefile.aspx
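
If you would rather not silence errors with @, the same approach can be wrapped in a small function that switches libxml to internal error handling while parsing. This is only a sketch: extractImageSources is a name I made up for illustration, not part of any library, and it assumes you want the src of every img in the markup.

```php
<?php
// Hypothetical helper (not from the original answer): return the src of
// every <img> element found in a possibly malformed HTML string.
function extractImageSources(string $html): array
{
    // loadHTML() rejects an empty string, so bail out early.
    if ($html === '') {
        return [];
    }

    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        if ($img->hasAttribute('src')) {
            $sources[] = $img->getAttribute('src');
        }
    }
    return $sources;
}
```

Calling extractImageSources($string) with the heredoc above should give a one-element array holding the same URL as the output shown.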

Note that the simplehtmldom class is not as powerful as the built-in DOM extension. It implements its own parser in pure PHP, whereas the DOM extension is backed by libxml, which recovers from malformed markup like this. In most cases its usage can be considered deprecated nowadays.
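
If you are used to selector-style lookups such as Simple HTML DOM's find(), DOMXPath can cover much of the same ground on top of DOMDocument. A rough sketch follows, restricted to images inside one container div; queryHtml, the "content" class name, and the sample markup are all made up for illustration.

```php
<?php
// Hypothetical helper: run an XPath query against possibly malformed HTML
// and return the string value of every matching node.
function queryHtml(string $html, string $xpath): array
{
    if ($html === '') {
        return [];
    }

    // Keep libxml parse errors out of the PHP warning stream.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $finder = new DOMXPath($doc);
    $nodes = $finder->query($xpath);
    if ($nodes === false) {
        // The XPath expression itself was invalid.
        return [];
    }

    $values = [];
    foreach ($nodes as $node) {
        $values[] = $node->nodeValue;
    }
    return $values;
}

// Made-up markup: one image inside the container div, one outside it.
$html = '<div class="content"><a href="#"><img alt="x" .="" src="a.png"></a></div>'
      . '<img src="b.png">';

print_r(queryHtml($html, '//div[@class="content"]//img/@src'));
```

With this sample markup the query should return only a.png, since the second image sits outside the div.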
