Word boundary regexp in JavaScript

Question

Let's suppose I have the following string:

bla bla "some" bla bla some bla bla something

I would like to replace all occurrences of 'some' that are bounded by non-word characters with '<some>'. I wrote a regular expression for this purpose:

/^|[^0-9a-zа-я](some)[^0-9a-zа-я]|$/gi

How I use it:

'bla bla "some" bla bla some bla bla something'.replace(/^|[^0-9a-zа-я](some)[^0-9a-zа-я]|$/gi, '<$1>')

And its result is

<>bla bla <some> bla bla<some>bla bla something<>

But I expected

bla bla "<some>" bla bla <some> bla bla something

How can I fix this regex? As far as I know, JavaScript's regular expressions don't support named groups.

Note: I cannot use \b because the words I want to match contain Cyrillic characters, and \b in JavaScript's regex engine doesn't work properly with non-Latin letters.
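To illustrate the note above, here is a minimal sketch of why \b fails here: it is defined in terms of \w, which by default covers only [A-Za-z0-9_], so a run of Cyrillic letters contains no word boundary at all. The sample strings are made up for illustration.

```javascript
// \b sits between a \w character and a non-\w character.
// Cyrillic letters are not \w, so no boundary exists around them.
const latin = 'some something'.replace(/\bsome\b/g, '<$&>');
const cyrillic = 'кот котик'.replace(/\bкот\b/g, '<$&>');

console.log(latin);    // <some> something
console.log(cyrillic); // кот котик (nothing matched)
```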

Answer 1

You could use something along these lines:

yourString.replace(/(^|[^0-9a-zа-я])(some)(?![0-9a-zа-я])/gi, '$1<$2>')

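As a rough sketch, the same pattern can be wrapped in a reusable function. The name wrapWord and the escaping step are assumptions added for illustration; they are not part of the answer above. The leading delimiter is captured and put back, while the trailing one is only checked with a negative lookahead, so it is never consumed.

```javascript
// Hypothetical helper: wraps every standalone occurrence of `word` in <...>.
function wrapWord(text, word) {
  // Escape regex metacharacters so `word` is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp(
    '(^|[^0-9a-zа-я])(' + escaped + ')(?![0-9a-zа-я])',
    'gi'
  );
  // $1 restores the delimiter (or empty string at the start of the text).
  return text.replace(pattern, '$1<$2>');
}

console.log(wrapWord('bla bla "some" bla bla some bla bla something', 'some'));
// bla bla "<some>" bla bla <some> bla bla something
console.log(wrapWord('кот и котик', 'кот'));
// <кот> и котик
```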

Note that, as Wiktor Stribiżew comments on another answer, your character class only matches the basic Cyrillic alphabet and would miss other Cyrillic characters, such as ё, which lies outside the а-я range. An alternative is to drop the negated character class and instead match the characters you expect as word separators, if those are easier to enumerate. With that in mind, ["\s] looks like a good start:

yourString.replace(/(^|[\s"])(some)(?![^\s"])/gi, '$1<$2>')

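Another way to address the incomplete character class, sketched here as an assumption rather than as part of the original answer, is to use Unicode property escapes. They need the u flag and an engine that supports ES2018. The helper name wrapWordUnicode is hypothetical, and treating letters, digits and underscore as word characters is a design choice you may want to adjust.

```javascript
// Hypothetical helper: a "word character" is any Unicode letter, number or "_".
function wrapWordUnicode(text, word) {
  // Escape regex metacharacters so `word` is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp(
    '(^|[^\\p{L}\\p{N}_])(' + escaped + ')(?![\\p{L}\\p{N}_])',
    'giu'
  );
  return text.replace(pattern, '$1<$2>');
}

console.log(wrapWordUnicode('bla bla "some" bla bla some bla bla something', 'some'));
// bla bla "<some>" bla bla <some> bla bla something
console.log(wrapWordUnicode('ёж и ёжик', 'ёж'));
// <ёж> и ёжик
```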

Answer 2

Group and capture the opening and closing alternatives and include these captures in the replacement string too:

var regex = /(^|[^0-9a-zа-яё])(some)([^0-9a-zа-яё]|$)/gi;
var output = 'bla bla "some" bla bla some bla bla something'.replace(regex, '$1<$2>$3');
console.log(output);
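One caveat worth checking with this approach: the trailing delimiter is consumed by the match, so it is no longer available as the leading delimiter of the next occurrence. The sketch below compares it with a lookahead-based variant on a made-up input where the word appears several times in a row.

```javascript
// Trailing delimiter is captured, so it is consumed by each match.
const consuming = /(^|[^0-9a-zа-яё])(some)([^0-9a-zа-яё]|$)/gi;
// Trailing delimiter is only checked, not consumed.
const lookahead = /(^|[^0-9a-zа-яё])(some)(?![0-9a-zа-яё])/gi;

const input = 'some some some';
const consumingResult = input.replace(consuming, '$1<$2>$3');
const lookaheadResult = input.replace(lookahead, '$1<$2>');

console.log(consumingResult); // <some> some <some>
console.log(lookaheadResult); // <some> <some> <some>
```

If the words are always separated by more than one delimiter character, both patterns give the same result.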
