
Remove the hashtag symbol in JS with a regex

I searched the forum but could not find anything that precisely matches what I need. I'm basically trying to remove the # symbol from the results I'm receiving. Here is a dummy example of the regex:

  let postText = 'this is a #test of #hashtags';
  var regexp = new RegExp('#([^\\s])', 'g');
  postText = postText.replace(regexp, '');
  console.log(postText);

It gives the following result:

this is a est of ashtags

What do I need to change so that it removes just the # symbols without cutting off the first letter of each word?

You need a backreference $1 as the replacement:

  let postText = 'this is a #test of #hashtags';
  var regexp = /#(\S)/g;
  postText = postText.replace(regexp, '$1');
  console.log(postText);
  // Alternative with a lookahead:
  console.log('this is a #test of #hashtags'.replace(/#(?=\S)/g, ''));

Note that I suggest replacing the constructor notation with a regex literal to make the regex a bit more readable, and replacing [^\s] with the shorter \S (any non-whitespace char).

Here, /#(\S)/g matches every occurrence (due to the g modifier) of # followed by any non-whitespace char, capturing that char into Group 1, and String#replace replaces each match with the captured char.
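To see why the original replacement ate the first letter, it can help to look at what each match actually covers. The following is a small sketch (the variable names are only illustrative) that lists the whole match next to Group 1 and then compares the two replacement strings:

```javascript
const text = 'this is a #test of #hashtags';

// Each match covers "#" plus one character; Group 1 holds only that character.
const matches = [...text.matchAll(/#(\S)/g)].map(m => ({ whole: m[0], group1: m[1] }));
console.log(matches);

// Replacing with '' drops the whole match, including the first letter.
const dropped = text.replace(/#(\S)/g, '');

// Replacing with '$1' puts the captured letter back.
const kept = text.replace(/#(\S)/g, '$1');

console.log(dropped);
console.log(kept);
```

Since the match is always two characters long, an empty replacement necessarily removes the letter as well; `$1` restores it.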

Alternatively, to avoid using backreferences (also called placeholders), you may use a lookahead, as in .replace(/#(?=\S)/g, ''), where (?=\S) requires a non-whitespace char immediately to the right of the current location. If you need to remove # at the end of the string, too, replace (?=\S) with (?!\s), which fails the match only if the next char is whitespace.
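The difference between the two lookaheads only shows up when a # sits at the very end of the string. A minimal sketch, using a made-up sample string:

```javascript
const sample = 'a #tag, a lone # and a trailing #';

// (?=\S): "#" must be followed by a non-whitespace character,
// so the "#" at the end of the string is left alone.
const positive = sample.replace(/#(?=\S)/g, '');

// (?!\s): "#" must not be followed by whitespace;
// the end of the string satisfies that, so the trailing "#" is removed.
const negative = sample.replace(/#(?!\s)/g, '');

console.log(JSON.stringify(positive));
console.log(JSON.stringify(negative));
```

In both variants the lone # in the middle stays, because it is followed by a space.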

You might be able to use the following:

let postText = 'this is a #test of #hashtags';
postText = postText.replace(/#\b/g, '');

It relies on the fact that a #hashtag contains a word boundary between the # and the word that follows it. By matching that word boundary with \b, we make sure not to match a standalone #.

However, it might match a bit more than you would expect, because the definition of "word character" in regex isn't obvious: it includes digits (so the # in #123 would be matched) and, more confusingly, the _ character (so the # in #___ would be matched).

I don't know if there's an authoritative source defining whether those are acceptable hashtags or not, so I'll let you judge whether this suits your needs.
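To help judge that, here is a small sketch that runs the \b version over a few made-up inputs. The helper name stripHash is hypothetical. One extra case is included on the assumption that non-ASCII tags matter to you: in JavaScript, \b is based on the ASCII word characters [A-Za-z0-9_], so a tag starting with an accented letter is not recognized.

```javascript
// Hypothetical helper wrapping the \b-based replacement
const stripHash = s => s.replace(/#\b/g, '');

const inputs = ['#test', '#123', '#___', '# alone', '#-dash', 'C# rocks', '#été'];
const results = inputs.map(stripHash);

// Print each input next to its result
inputs.forEach((input, i) => console.log(JSON.stringify(input), '->', JSON.stringify(results[i])));
```

The first three inputs lose their #; the remaining ones keep it, because the character after the # (or the end of the match position) is not an ASCII word character.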

It may be easier to write your own function, which could look like this (it also covers the case where the symbol is repeated):

  function replaceSymbol(symbol, string) {
    // Nothing to do if the symbol does not occur at all
    if (string.indexOf(symbol) < 0) {
      return string;
    }

    // With a string pattern, replace() removes only the first occurrence,
    // so repeat until none are left
    while (string.indexOf(symbol) > -1) {
      string = string.replace(symbol, '');
    }

    return string;
  }

  var a = replaceSymbol('#', '##s##u#c###c#e###ss is he#re'); // 'success is here'

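You only need to match the #; the part in parentheses also matches the character right after it.

postText = postText.replace(/#/g, '');

With the g flag, this replaces every #. Note that passing the string '#' instead of a regex would replace only the first occurrence.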
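As a quick check of how String.prototype.replace treats a string pattern compared with a global regex, here is a minimal sketch (variable names are illustrative). A split/join variant is included as one regex-free way to remove every occurrence:

```javascript
const text = 'this is a #test of #hashtags';

// String pattern: only the first "#" is removed.
const firstOnly = text.replace('#', '');

// Regex with the g flag: every "#" is removed.
const allRegex = text.replace(/#/g, '');

// Regex-free alternative: split on "#" and glue the pieces back together.
const allSplit = text.split('#').join('');

console.log(firstOnly);
console.log(allRegex);
console.log(allSplit);
```

Newer runtimes also provide String.prototype.replaceAll, which accepts a plain string and replaces every occurrence. Unlike the lookahead and \b approaches above, all of these also remove a standalone #.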
