How to add quotes to the src attribute of the img tag

Question

I have to correct several posts in a database that look like this:

<a href="somelink.html"><img src=someimage.jpg border=1 alt="some text"></a>

So I need to:

remove the border=1 attribute (str_replace will do the job)
add quotes at the beginning and end of the src attribute value: src="someimage.jpg"
close the img tag by adding /> at the end of the tag

One thing I tried is to parse the DOM and read the src attribute:

$doc = new DOMDocument();
// Clean the markup once, then parse the cleaned result.
$body = $this->removeUnnecessaryTags($body);
$doc->loadHTML($body);
$imageTags = $doc->getElementsByTagName('img');

// Collect the src and alt values of every img tag.
$result = [];
foreach ($imageTags as $tag) {
    $result[] = [ 'src' => $tag->getAttribute('src'), 'alt' => $tag->getAttribute('alt') ];
}

I know this can be done with regex but my regex knowledge is not very good. Any ideas?

Thanks

Answer 1

All you need is to use DOMDocument features and libxml options:

$html = '<a href="somelink.html"><img src=someimage.jpg border=1 alt="some text"></a>';

libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);

$result = $dom->saveXML($dom->documentElement);

echo $result;

LIBXML_HTML_NODEFDTD prevents a default DTD from being added automatically when none is present. LIBXML_HTML_NOIMPLIED prevents the implied html and body tags from being added when they are missing.

The saveXML method serializes the document with XML-compliant syntax, which solves the self-closing tag problem. $dom->documentElement is passed as the parameter to avoid the XML declaration that is otherwise added automatically.(*)

Whichever method you use (saveXML or saveHTML), attribute values are automatically enclosed in double quotes.
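
The snippet above takes care of the quoting and the self-closing tag, but it leaves border=1 in place. As a rough sketch, the attribute can also be dropped through the DOM before serializing, instead of with str_replace. The function name normalizeImgMarkup is made up for illustration, and the sketch assumes each post has a single root element like the sample does:

```php
<?php
// Hypothetical helper, not part of the original answer.
// Assumes $html is non-empty and has a single root element, as in the sample post.
function normalizeImgMarkup(string $html): string
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);

    foreach ($dom->getElementsByTagName('img') as $img) {
        // Does nothing harmful when the img has no border attribute.
        $img->removeAttribute('border');
    }

    // Discard the collected warnings and restore the previous error mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Serializing the root node (not the whole document) skips the XML declaration.
    return $dom->saveXML($dom->documentElement);
}

$html = '<a href="somelink.html"><img src=someimage.jpg border=1 alt="some text"></a>';
echo normalizeImgMarkup($html), "\n";
```

Removing the attribute on the parsed node avoids accidentally touching the text border=1 if it ever appears outside an img tag, which a plain str_replace over the whole post could do.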

(*) This also drops the DTD, if there is one. To preserve it, you can use this workaround to remove the XML declaration instead:

$result = $dom->saveXML();
$result = substr($result, strpos($result, "\n") + 1);
