
Regexp matching hashtags not wrapped in html tags

I want to write a regexp that matches hashtags starting with @ or # that are not wrapped in an HTML anchor tag. My expression (@|#)([a-zA-Z_]+)(?!<\\/[a]) doesn't work, because in this text:

<p>@john Olor it amet, consectetuer adipiscing elit. 
Aenean commodofadgfsd 
<a class="autocompletedTag" href="#" data-id="u:2">@john_wayne</a></p>

it matches @john and @john_wayne, but I don't want to match @john_wayne.

How can I do this?

Examples

In this code:

<p>@john @kate <a>@royal_baby</a> #england <a>#russia</a></p>

I want to match @john, @kate and #england, but not @royal_baby and #russia.

In this code:

<p>#sale #stack #hello <a>@batman</a> #avengers <a>#iron_man</a></p>

I want to match #sale, #stack, #hello and #avengers, but not @batman and #iron_man.

You may use the following regex:

/(<a[^>]*>.*?[@#][a-zA-Z_]+.*?<\/a>)|([@#][a-zA-Z_]+)/g

The idea is to match both cases (a whole anchor that contains a tag, or a bare tag) and use a callback to keep only the bare tags:

var input = '<p>@john Olor it amet, consectetuer adipiscing elit.\
Aenean commodofadgfsd \
<a class="autocompletedTag" href="#" data-id="u:2">@john_wayne</a></p>\
<p>@john @kate <a>@royal_baby</a> #england <a>#russia</a></p>\
<p>#sale #stack #hello <a>@batman</a> #avengers <a>#iron_man</a></p>';

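var matches = []; // collects the tags found outside <a>...</a>
input.replace(/(<a[^>]*>.*?[@#][a-zA-Z_]+.*?<\/a>)|([@#][a-zA-Z_]+)/g, function(all, anchor, tag){
    if(tag){ // the second group is set only for a tag outside an anchor
        matches.push(tag); // then add it to matches
    }
    return all; // replace() is used only to iterate; the return value is discarded
});

document.getElementById('results').innerHTML = matches.join(); // display the results, comma-separated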
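The snippet above needs a page with a `results` element. As a minimal sketch that also runs outside a browser, the same pattern can be wrapped in a function and checked against the two examples from the question. The name `extractTags` is hypothetical and not part of the original answer.

```javascript
// Hypothetical helper (not from the original answer):
// returns the @/# tags that are not inside <a>...</a>.
function extractTags(html) {
    // Same pattern as above: group 1 consumes whole anchors that contain a tag,
    // group 2 captures free-standing tags.
    var pattern = /(<a[^>]*>.*?[@#][a-zA-Z_]+.*?<\/a>)|([@#][a-zA-Z_]+)/g;
    var tags = [];
    var m;
    while ((m = pattern.exec(html)) !== null) {
        if (m[2]) { // only the second group holds a tag we want
            tags.push(m[2]);
        }
    }
    return tags;
}

var first = extractTags('<p>@john @kate <a>@royal_baby</a> #england <a>#russia</a></p>');
var second = extractTags('<p>#sale #stack #hello <a>@batman</a> #avengers <a>#iron_man</a></p>');

console.log(first);  // [ '@john', '@kate', '#england' ]
console.log(second); // [ '#sale', '#stack', '#hello', '#avengers' ]
```

One limitation of this pattern: `<a[^>]*>` also matches other tags whose names start with "a", and an anchor with no tag inside can swallow the text up to the next tagged anchor. For arbitrary HTML, a DOM parser is likely the safer choice.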


Explanation

  • [@#] : match either @ or # one time
  • [a-zA-Z_]+ : match letters and underscore one or more times
  • <a : match <a
  • [^>]*> : match anything except > zero or more times and match > at the end
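  • .*?[@#][a-zA-Z_]+.*? : lazily match the content between <a> and </a>, which must contain a tag
  • <\/a> : match the closing tag </a>
  • | : alternation; the anchor branch is tried first, so a tagged anchor is consumed whole and only the second group holds a free-standing tag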
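For comparison, here is a small sketch of why a negative lookahead alone is not enough. It assumes the question's lookahead is written as a regex literal with a single escaped slash, which may differ from how the asker actually ran it. Because `[a-zA-Z_]+` can backtrack, the engine gives up one character and the lookahead then succeeds, so tags inside anchors are truncated rather than skipped.

```javascript
// Assumption: the lookahead idea from the question, written as a regex literal.
var lookahead = /[@#][a-zA-Z_]+(?!<\/a)/g;
var sample = '<p>@john @kate <a>@royal_baby</a> #england <a>#russia</a></p>';

// "@royal_baby" is followed by "</a", so the lookahead fails there;
// the engine backtracks to "@royal_bab", which is followed by "y", and accepts it.
var found = sample.match(lookahead);

console.log(found); // [ '@john', '@kate', '@royal_bab', '#england', '#russi' ]
```

Matching the whole anchor in its own branch, as in the answer's pattern, avoids this because the anchored tag is consumed before the bare-tag branch gets a chance to match part of it.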


 