简体   繁体   English

正则表达式转换链接中的hashtag而不破坏现有的HTML代码

[英]Regex to transform hashtag in a link without breaking existing HTML code

I want to convert all the URLs in a javascript string to links, in this strings there are also words that begin with a hashtag #. 我想将javascript字符串中的所有URL转换为链接,在此字符串中还有以#标签开头的单词。

As of now I created two regex in cascade, one that creates html anchor tags based on urls and another that creates anchor tags for the hashtags (like in Twitter). 截至目前,我在级联中创建了两个正则表达式,一个基于url创建html锚标签,另一个为hashtags创建锚标签(如在Twitter中)。

I am having a lot of problems trying to parse www.sitename.com/index.php#someAnchor into the right markup. 我在尝试将www.sitename.com/index.php#someAnchor解析为正确的标记时遇到了很多问题。

content = urlifyLinks(content);
content = urlifyHashtags(content);

where the two functions are as follows: 这两个函数如下:

function urlifyHashtags(text) {
    var hashtagRegex = /^#([a-zA-Z0-9]+)/g;
    var tempText = text.replace(hashtagRegex, '<a href="index.php?keywords=$1">#$1</a>');

    var hashtagRegex2 = /([^&])#([a-zA-Z0-9]+)/g;
    tempText = tempText.replace(hashtagRegex2, '$1<a href="index.php?keywords=$2">#$2</a>');

    return tempText;
}

function urlifyLinks(inputText) {
    var replaceText, replacePattern1, replacePattern2, replacePattern3;

    replacePattern1 = /(\b(https?|ftp):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gim;
    replacedText = inputText.replace(replacePattern1, '<a href="$1" target="_blank">$1</a>');

    replacePattern2 = /(^|[^\/])(www\.[\S]+(\b|$))/gim;
    replacedText = replacedText.replace(replacePattern2, '$1<a href="http://$2" target="_blank">$2</a>');

    replacePattern3 = /(\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,6})/gim;
    replacedText = replacedText.replace(replacePattern3, '<a href="mailto:$1">$1</a>');
    return replacedText;
}

I am considering to parse the output of urlifyLinks and apply the regex to all the dom elements that are text elements on the first level, is that an ugly thing to do? 我正在考虑解析urlifyLinks的输出并将正则表达式应用于第一级文本元素的所有dom元素,这是一个难看的事情吗?

You can avoid this problem by using a single regex with a callback function replacement. 您可以通过使用具有回调函数替换的单个正则表达式来避免此问题。

For example: 例如:

function linkify(str){
    // order matters
    var re = [
        "\\b((?:https?|ftp)://[^\\s\"'<>]+)\\b",
        "\\b(www\\.[^\\s\"'<>]+)\\b",
        "\\b(\\w[\\w.+-]*@[\\w.-]+\\.[a-z]{2,6})\\b", 
        "#([a-z0-9]+)"];
    re = new RegExp(re.join('|'), "gi");

    return str.replace(re, function(match, url, www, mail, twitler){
        if(url)
            return "<a href=\"" + url + "\">" + url + "</a>";
        if(www)
            return "<a href=\"http://" + www + "\">" + www + "</a>";
        if(mail)
            return "<a href=\"mailto:" + mail + "\">" + mail + "</a>";
        if(twitler)
            return "<a href=\"foo?bar=" + twitler + "\">#" + twitler + "</a>";

        // shouldnt get here, but just in case
        return match;
    });
}

Twitler

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM