Wrap HTML tags in a plain string with another HTML tag

I want to wrap an HTML tag with another HTML tag in a string (a plain string, not a DOM element). I created this function, but I wonder whether I could do it in one go, without a forEach loop.

This is the working function:

function style(content) {
    var tempStyledContent = content;

    // match() returns null when the string contains no <img> tags.
    var imgMatches = tempStyledContent.match(/(<img.*?src=["'](.+?)["'].*?>)/g) || [];

    imgMatches.forEach(function (imgMatch) {
        var imgTag = imgMatch;
        // Accept either quote style around the src value.
        var imgSrc = imgMatch.match(/src\s*=\s*["'](.+?)["']/)[1];

        tempStyledContent = tempStyledContent.replace(imgTag,
            "<a href=\"" + imgSrc + "\" data-fancybox>" + imgTag + "</a>");
    });

    return tempStyledContent;
}

The parameter content is a string containing HTML. The function returns the same HTML as the input, but with (fancybox) a tags wrapped around all the img tags it contains.

So an input string like

"<div><img src='example.jpg'/></div>"

will output

"<div><a href='example.jpg' data-fancybox><img src='example.jpg'/></a></div>"

Can anyone improve this? I know too little about regexes to make this better.

Manipulating HTML with regex is notoriously problematic. Changes that would be trivial in a DOM parser can be very difficult to write a robust regex for; and when a regex fails, it fails silently, which makes errors easy to miss. When working in regex you also have to be careful to handle all possible variations in markup, such as whitespace, attribute order, quoting style, tag closing style, attribute contents that resemble HTML but which you don't want modified, etc.

As discussed exhaustively in the comment thread below, given enough time and effort it's certainly possible to handle all of these things in regex, but it leads to a complex, difficult-to-maintain regex, and most importantly it's difficult to be certain your regex accommodates every possible valid markup variation. DOM parsing handles all of this automatically, and lets you work with the structured data directly instead of having to cope with all the possible variations in its string representation.
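To make that concrete, here is a rough sketch of what a single-pass regex version (one replace call with a callback, no forEach) could look like. The function name wrapImagesWithRegex and the exact pattern are my own assumptions for illustration, not something from the question. It copes with both quoting styles, but it still can't tell a real img tag from text that merely looks like one:

```javascript
// Hypothetical helper (not from the question): wraps every <img> tag found in
// an HTML string in a fancybox link using one String.prototype.replace call.
function wrapImagesWithRegex(html) {
    // Group 1 captures the quote character, group 2 the src value up to the
    // matching closing quote (\1 is a backreference to group 1).
    var imgPattern = /<img\b[^>]*?\bsrc\s*=\s*(["'])(.*?)\1[^>]*>/gi;

    return html.replace(imgPattern, function (imgTag, quote, src) {
        return '<a href="' + src + '" data-fancybox>' + imgTag + '</a>';
    });
}

// Works for the simple case from the question:
console.log(wrapImagesWithRegex("<div><img src='example.jpg'/></div>"));

// Silently produces broken markup here, because the "tag" is really just
// text inside a title attribute:
console.log(wrapImagesWithRegex("<p title=\"<img src='fake.jpg'>\">text</p>"));
```

The second call neither throws nor warns; it just inserts an a tag into the middle of the title attribute. A pattern like this also can't handle an unquoted src value, and it would happily pick up data-src instead of src, so treat it as a demonstration of the problem rather than a recommended solution.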

Therefore, if you need to make nontrivial changes to an HTML string, it's almost always best to convert your HTML into a true DOM tree, manipulate that using standard DOM methods, then (if necessary) convert it back into a string. Fortunately it doesn't take a lot of code to do so. Here's a simple vanilla JS demo:

var htmlToElement = function(html) {
    var template = document.createElement('template');
    template.innerHTML = html.trim();
    return template.content.firstChild;
};

var elementToHtml = function(el) {
    return el.outerHTML;
};

// Usage demo:
var string = "<div>This <b>is some</b> <i>html</i><img src='http://example.com'></div>";
var foo = htmlToElement(string);

// perform your DOM manipulation as needed on foo here.
// This would look much simpler if I wasn't so stubborn about avoiding jQuery
// these days, but here we are anyway:
foo.querySelectorAll('img').forEach(function(img) {
    var link = document.createElement('a');
    link.setAttribute('data-fancybox', true);
    link.setAttribute('href', img.getAttribute('src'));
    img.parentNode.insertBefore(link, img);
    link.appendChild(img);
});

// back to a string:
var bar = elementToHtml(foo);
console.log(bar);

OK, I'm probably going to do DOM manipulation as @DanielBeck suggested. Once Knockout has finished binding, I will use $.wrap (http://api.jquery.com/wrap/) to do my manipulation. I just hoped there was an easy way without using jQuery, so if there are other suggestions, please post them as comments.
