PHP Simple HTML DOM can't read "data-src" or "img src" without http: in path

Question

I'm working with PHP Simple HTML DOM and just discovered that it can't read images from a data-src attribute, or from an <img src> whose URL lacks the http: prefix, e.g. <img src="//static.mysite.com/123.jpg">.

Is there any way to make this work?

My code is:

if ($htm->find('img')) {
    foreach ($htm->find('img') as $element) {
        $raw = file_get_contents_curl($element->src);
        $im = @imagecreatefromstring($raw);
        $width = @imagesx($im);
        $height = @imagesy($im);
        if ($width > 500 && $height >= 350) {
            $hasimg = '1';
            echo '<img src=\'' . $element->src . '\'>';
        }
    } // end foreach
} // end if htm

Answer 1

It works for me:

$doc = str_get_html('<img data-src="foo">');
echo $doc->find('img', 0)->getAttribute('data-src');
//=> outputs: foo

Answer 2

echo $htm->find('img', 0)->getAttribute('data-src');
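
To check this outside the library, here is a minimal sketch of the same "prefer data-src, fall back to src" lookup. It uses PHP's built-in DOMDocument instead of Simple HTML DOM so that it runs on its own; pick_image_url() and the sample markup are made up for illustration.

```php
<?php
// Sketch using PHP's built-in DOMDocument, not Simple HTML DOM.
// pick_image_url() is a hypothetical helper name.
function pick_image_url(DOMElement $img): string
{
    // Lazy-loading markup often keeps the real URL in data-src,
    // so try that attribute first and fall back to src.
    foreach (['data-src', 'src'] as $attr) {
        $value = trim($img->getAttribute($attr)); // '' when the attribute is missing
        if ($value !== '') {
            return $value;
        }
    }
    return '';
}

// Sample markup, assumed for the example.
$html = '<img data-src="//static.mysite.com/123.jpg" src="placeholder.gif">'
      . '<img src="foo.png">';

$doc = new DOMDocument();
// Suppress the warnings that loose real-world HTML tends to trigger.
@$doc->loadHTML($html);

$urls = [];
foreach ($doc->getElementsByTagName('img') as $img) {
    $urls[] = pick_image_url($img);
}

print_r($urls);
```

With Simple HTML DOM itself, the equivalent check would presumably be calling getAttribute('data-src') first and using $element->src only when that comes back empty.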

Answer 3

If file_get_contents_curl() is a function you defined in your own code as a cURL wrapper, you need to tell cURL which protocol to use:

curl_setopt($ch, CURLOPT_PROTOCOLS, CURLPROTO_HTTP);

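That restricts cURL to HTTP. CURLOPT_PROTOCOLS is a bitmask of the protocols cURL is allowed to use rather than a default scheme, so if a protocol-relative URL in the image src attribute still fails, prepend http: to it before making the request.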

Answer 4

Leaving out the protocol (http/https) is called a "network-path reference" and means that the protocol of the page the URL is embedded in should be used. This makes no sense with file_get_contents() or cURL on their own, because they are not aware of any page.

Long story short, you have to add the protocol yourself.

Try this:

$url=$element->src;
if (substr($url, 0, 2)=='//') $url='http:'.$url;
$raw=file_get_contents_curl($url);
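
If you know the URL of the page you scraped, you can reuse its scheme instead of hard-coding http:. A minimal sketch, assuming a hypothetical helper named add_scheme() and a made-up page URL:

```php
<?php
// Hypothetical helper: give a network-path reference ("//host/path")
// the scheme of the page it was found on. The default page URL is
// only a placeholder so the fallback ends up being "http".
function add_scheme(string $url, string $pageUrl = 'http://example.com/'): string
{
    if (strncmp($url, '//', 2) !== 0) {
        return $url; // not a network-path reference, leave it untouched
    }
    $scheme = parse_url($pageUrl, PHP_URL_SCHEME);
    if (!is_string($scheme) || $scheme === '') {
        $scheme = 'http'; // page URL had no usable scheme
    }
    return $scheme . ':' . $url;
}

echo add_scheme('//static.mysite.com/123.jpg', 'https://www.mysite.com/page.html'), "\n";
// https://static.mysite.com/123.jpg
echo add_scheme('//static.mysite.com/123.jpg'), "\n";
// http://static.mysite.com/123.jpg
echo add_scheme('http://static.mysite.com/123.jpg'), "\n";
// http://static.mysite.com/123.jpg
```

In the loop from the question you would then call file_get_contents_curl(add_scheme($element->src, $pageUrl)), where $pageUrl is whatever address you loaded the HTML from. Plain relative paths such as images/foo.jpg are not handled here and would still need to be resolved against the page URL.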

4 answers

solution1
4 2014-04-14 02:02:13

solution2
1 2022-03-11 15:06:49

solution3
0 2014-04-13 17:21:17

solution4
0 2014-04-13 17:52:41
