In the following example sentence:
Green shirt green hat
Is it possible to use a regex to detect two identical words and replace the second with "and"
to become:
Green shirt and hat
A more difficult string example. Here the first of the identical words needs to be replaced:
You are an artistically gifted musically gifted individual
Should become:
You are an artistically and musically gifted individual
First off, a regex isn't the ideal solution for this, but I'm sure you have your reasons for using it.
((\b[a-z]{1,}\b).*?)(\b\2\b)(.*)$
Replace with: \1and\4
This regex will find two identical words in a string and replace the second one with "and". It needs the case-insensitive flag, otherwise [a-z] cannot match the capital G in Green.
Live Demo
https://regex101.com/r/yG3yM6/2
Sample text
Green shirt green hat
Green shirt greenish hat
You are an artistically gifted musically gifted individual
Sample Matches
Green shirt and hat
Green shirt greenish hat
You are an artistically gifted musically and individual
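As a rough JavaScript sketch of the replacement above (the function name replaceSecond is made up for illustration, and the i flag is assumed so that Green can match green), applying the pattern to each line reproduces the sample matches:

```javascript
// Hypothetical helper: replaces the second occurrence of the first repeated
// word in a single line with "and". JavaScript writes $1/$4 in the
// replacement string where the answer writes \1/\4.
function replaceSecond(line) {
  const pattern = /((\b[a-z]{1,}\b).*?)(\b\2\b)(.*)$/i;
  return line.replace(pattern, "$1and$4");
}

const samples = [
  "Green shirt green hat",
  "Green shirt greenish hat",
  "You are an artistically gifted musically gifted individual",
];

for (const line of samples) {
  console.log(replaceSecond(line));
}
```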
NODE EXPLANATION
----------------------------------------------------------------------
( group and capture to \1:
----------------------------------------------------------------------
( group and capture to \2:
----------------------------------------------------------------------
\b the boundary between a word char (\w)
and something that is not a word char
----------------------------------------------------------------------
[a-z]{1,} any character of: 'a' to 'z' (at least
1 times (matching the most amount
possible))
----------------------------------------------------------------------
\b the boundary between a word char (\w)
and something that is not a word char
----------------------------------------------------------------------
) end of \2
----------------------------------------------------------------------
.*? any character except \n (0 or more times
(matching the least amount possible))
----------------------------------------------------------------------
) end of \1
----------------------------------------------------------------------
( group and capture to \3:
----------------------------------------------------------------------
\b the boundary between a word char (\w)
and something that is not a word char
----------------------------------------------------------------------
\2 what was matched by capture \2
----------------------------------------------------------------------
\b the boundary between a word char (\w)
and something that is not a word char
----------------------------------------------------------------------
) end of \3
----------------------------------------------------------------------
( group and capture to \4:
----------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
----------------------------------------------------------------------
) end of \4
----------------------------------------------------------------------
$ before an optional \n, and the end of a
"line"
----------------------------------------------------------------------
Although not addressed in the OP, if the words in question use non a-z characters, then you could replace [a-z] with [a-z]|[^\x00-\x7F], which will match non-English characters. But then we'll need to change the \b\2\b to (?<=\s|^)\2(?=\s|$) so we can ensure correct matching.
((\b(?:[a-z]|[^\x00-\x7F]){1,}\b).*?)((?<=\s|^)\2(?=\s|$))(.*)$
Live Demo https://regex101.com/r/wD8yF5/2
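A minimal JavaScript sketch of this variant, assuming an engine with lookbehind support (ES2018 or later) and the case-insensitive flag. The name replaceSecondNonAscii is hypothetical and the sample sentence is invented for illustration:

```javascript
// Hypothetical helper using the extended pattern from this answer.
// The lookbehind (?<=...) needs an ES2018-capable engine.
function replaceSecondNonAscii(line) {
  const pattern =
    /((\b(?:[a-z]|[^\x00-\x7F]){1,}\b).*?)((?<=\s|^)\2(?=\s|$))(.*)$/i;
  return line.replace(pattern, "$1and$4");
}

// "\u00EF" is the precomposed character ï, so the word is "naïve".
console.log(replaceSecondNonAscii("Na\u00EFve plan na\u00EFve idea"));
```

In JavaScript \b is defined in terms of \w, which covers only ASCII letters, digits and underscore, so a word that begins or ends with a non-ASCII letter may still not be picked up as a whole word by this pattern.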
By modifying this answer, you can do it:
console.log( myFunc("Green shirt green hat") );
console.log( myFunc("Big red eyed rabbits red Ferrari") );
function myFunc(str) {
  return str.replace(/\b(\w+)(.+)(\1)\b/gi, "$1$2and");
}
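One thing worth checking with this pattern is that (.+) is greedy, so when a word occurs more than twice it is the last occurrence that gets replaced. A small sketch comparing it with a lazy variant (the sample sentence and the names greedyReplace and lazyReplace are assumptions made for illustration):

```javascript
// Greedy version, as in the answer: (.+) grabs as much as possible,
// so the LAST repeat of the word is the one replaced.
function greedyReplace(str) {
  return str.replace(/\b(\w+)(.+)(\1)\b/gi, "$1$2and");
}

// Lazy variant (an assumption, not part of the answer): (.+?) stops at
// the first repeat, and the extra \b keeps the first word whole.
function lazyReplace(str) {
  return str.replace(/\b(\w+)\b(.+?)\b\1\b/i, "$1$2and");
}

const text = "red car red bike red bus";
console.log(greedyReplace(text)); // red car red bike and bus
console.log(lazyReplace(text));   // red car and bike red bus
```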
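You can use a RegExp such as /(\bgreen\b)/ig, where green is the word to match, together with String.prototype.replace() and a replacement function. The documentation describes the function's p1, p2, ... parameters as follows:
p1, p2, ... The nth parenthesized submatch string, provided the first argument to replace() was a RegExp object. (Corresponds to $1, $2, etc. above.) For example, if /(\a+)(\b+)/ was given, p1 is the match for \a+, and p2 for \b+.
Note that the pattern used here has only one capture group, so the third argument passed to the function is the offset of the match, not a second submatch. That offset is 0 for the first match only when the string starts with the word, which is why the code below keeps the first green and replaces every later green with and:
var str = "Green shirt green hat green";
var re = function(m, p1, offset) { return offset ? "and" : m; };
str = str.replace(/(\bgreen\b)/ig, re);
console.log(str);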
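Since the approach above depends on the first match sitting at the very start of the string, here is a slightly more defensive sketch that counts matches instead. The helper replaceRepeats and its parameters are hypothetical names, and the word is escaped before it is put into the RegExp:

```javascript
// Hypothetical helper: keep the first occurrence of `word`, replace every
// later whole-word occurrence with `replacement` (case-insensitive).
function replaceRepeats(str, word, replacement) {
  // Escape regex metacharacters so `word` is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const re = new RegExp("\\b" + escaped + "\\b", "gi");
  let seen = 0;
  return str.replace(re, (match) => (seen++ === 0 ? match : replacement));
}

console.log(replaceRepeats("Green shirt green hat green", "green", "and"));
console.log(replaceRepeats("A green shirt green hat", "green", "and"));
```

With this version the first green survives even when the sentence does not begin with it.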
You can use the following:
/(\b([^\s]+)\b.*?)\b\2\b/gi
Test case:
var regex = /(\b([^\s]+)\b.*?)\b\2\b/gi;
'Green shirt green hat with blue shoes blue glasses'.replace(regex, '$1and')
=== 'Green shirt and hat with blue shoes and glasses';
'Orange colored oranges orange belts'.replace(regex, '$1and')
=== 'Orange colored oranges and belts';
The answer to your first example - which I read as replace the second of the first repeated word with 'and' - is:
var str = 'Green shirt green hat';
str = str.replace(/(\b\S+\b)(.+?)(\b\1\b)/i, '$1$2and');
console.log(str);
The answer to your second example - which I read as replace the first repeated word with 'and' - is:
var str = 'You are an artistically gifted musically gifted individual';
str = str.replace(/(\b\S+\b)(.+?)(\b\1\b)/i, 'and$2$1');
console.log(str);
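Both replacements share the same pattern and differ only in the replacement string, so they could be folded into one function. A rough sketch, where replaceRepeated and its which parameter are hypothetical names:

```javascript
// Hypothetical helper: finds the first word that occurs twice and replaces
// either its first or its second occurrence with "and".
function replaceRepeated(str, which) {
  const pattern = /(\b\S+\b)(.+?)(\b\1\b)/i;
  // $1 = first occurrence, $2 = text in between.
  // The replacement strings are the ones used in the two answers above.
  const replacement = which === "first" ? "and$2$1" : "$1$2and";
  return str.replace(pattern, replacement);
}

console.log(replaceRepeated("Green shirt green hat", "second"));
console.log(
  replaceRepeated(
    "You are an artistically gifted musically gifted individual",
    "first"
  )
);
```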