Regex: Replace foo if it's a word or inside a URL

Question

Given this:

str = "foo myfoo http://thefoobar.com/food is awesome";
str.replace(magicalRegex, 'bar');

The expected result is:

"bar myfoo http://thebarbar.com/bard is awesome"

I get the \b(foo)\b part, but I can't figure out how to match and capture foo inside a URL. For these purposes, assume URLs always start with http.

Any help?

Answer 1

You can use this code (it works with your example, but I haven't tried it with more complex inputs):

str = 'foo myfoo http://thefoobar.com/food is awesome';
str = str.replace(/\bfoo\b/g, 'bar');
while (/http:\/\/[^\s]*?foo/.test(str))
    str = str.replace(/(http:\/\/[^\s]*?)?foo/g, function($0, $1) {
        return $1 ? $1 + 'bar' : $0;
    });
console.log(str);

OUTPUT:

bar myfoo http://thebarbar.com/bard is awesome

Live Demo: http://ideone.com/8xGy2h
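
To make the role of the while loop visible, here is a minimal sketch that wraps the same two regexes in a function and counts the passes. The function name replaceFoo and the passes counter are illustrative additions, not part of the original answer.

```javascript
// Sketch of the loop above wrapped in a function. `replaceFoo` and the
// `passes` counter are hypothetical names added for illustration.
function replaceFoo(input) {
    // Standalone words first.
    let str = input.replace(/\bfoo\b/g, 'bar');
    let passes = 0;
    // Each pass rewrites only the first remaining "foo" in every URL: once the
    // scan has consumed the "http://" prefix, later occurrences in the same
    // URL match without it and are left unchanged until the next pass.
    while (/http:\/\/[^\s]*?foo/.test(str)) {
        str = str.replace(/(http:\/\/[^\s]*?)?foo/g, function ($0, $1) {
            // $1 is undefined when no URL prefix was captured: keep the match.
            return $1 ? $1 + 'bar' : $0;
        });
        passes++;
    }
    return { result: str, passes: passes };
}

const out = replaceFoo('foo myfoo http://thefoobar.com/food is awesome');
console.log(out.result); // bar myfoo http://thebarbar.com/bard is awesome
console.log(out.passes); // 2
```

One caveat with this approach: if the replacement text itself contained "foo", the loop condition would stay true and the loop would never finish.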

Answer 2

I think you are going to have to go multi-step to get this done right. Basically, you are doing two separate (albeit similar) regex replacements here:

a global replacement of the character group "foo", if it occurs within a link, and
a global replacement of the word "foo" in the rest of the string.

This code would run through both steps separately (URL first, rest of the string second) and give the final replacement:

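var urlPattern = /(http:\/\/[^\s]+)/;   // first URL: "http://" up to the next whitespace
var urlFooPattern = /(foo)/g;           // every "foo" inside the URL
var globalFooPattern = /\b(foo)\b/g;    // "foo" as a standalone word

var str = "foo myfoo http://thefoobar.com/food is awesome";

// Step 1: rewrite "foo" inside the URL, if the string contains one.
var urlMatch = str.match(urlPattern);
if (urlMatch) {
    var urlString = urlMatch[0].replace(urlFooPattern, "bar");
    str = str.replace(urlPattern, urlString);
}

// Step 2: rewrite standalone "foo" words in the rest of the string.
str = str.replace(globalFooPattern, "bar");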

Note: this assumes that there is only one URL in the string. Handling multiple URLs would be a good bit more complicated:

capture all of the URLs in an array using str.match with a global (g-flagged) version of urlPattern,
create a new array by looping through each URL and doing an individual "foo" replacement on each, and
loop through the original array of matches, using each one as the search string to be replaced by the updated value in the second array.
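
As a possible shortcut for the multiple-URL case, a replacer function passed to String.prototype.replace can do the per-URL loop implicitly. The sketch below assumes, as the question does, that URLs start with http:// and end at the next whitespace; the function name replaceFooEverywhere is hypothetical.

```javascript
// Illustrative helper (not from the answer): handles any number of URLs.
function replaceFooEverywhere(input) {
    // Assumption: a URL starts with "http://" and runs to the next whitespace.
    const urlPattern = /http:\/\/\S+/g;
    // Step 1: rewrite every "foo" inside each URL.
    const urlsDone = input.replace(urlPattern, function (url) {
        return url.replace(/foo/g, 'bar');
    });
    // Step 2: rewrite standalone "foo" words in the rest of the string.
    return urlsDone.replace(/\bfoo\b/g, 'bar');
}

console.log(replaceFooEverywhere('foo http://foo.com/xfoo and http://afoob.org myfoo'));
// bar http://bar.com/xbar and http://abarb.org myfoo
```

Because the replacer is called once per match, no intermediate arrays are needed.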

Answer 3

If you want to use boundaries to match only "foo" but not "myfoo", you'll need to use an or operation (|) to match the URLs. By necessity, if "foo" appears in the middle of a URL, it will not be surrounded by word boundaries.

Something like this should work for you:

\b(foo)\b|http\S*(foo)\S*

You can run further tests here if needed.

EDIT: Apologies, I thought the OP was looking to capture those words and URLs. As far as I know, look-behind regexes, which would avoid capturing the root of the URL for replacement, aren't natively supported in JS, but they can frequently be emulated with a simple function; see here for a discussion of how to do so.
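
A rough sketch of both ideas follows: the alternation combined with a replacer function, and a look-behind variant. Look-behind assertions were added to JavaScript in ES2018, after this answer was written, so the second function assumes a runtime that supports them. Both function names are hypothetical, and both assume URLs start with http:// and end at whitespace.

```javascript
// Single pass: match either a whole URL or a standalone word, then let the
// replacer decide what to do with each match.
function replaceWithAlternation(input) {
    return input.replace(/http:\/\/\S+|\bfoo\b/g, function (match) {
        return match.startsWith('http://')
            ? match.replace(/foo/g, 'bar') // inside a URL: replace every "foo"
            : 'bar';                       // standalone word
    });
}

// Look-behind variant: matches "foo" as a word, or any "foo" preceded by
// "http://" plus non-whitespace. Assumes ES2018 look-behind support.
function replaceWithLookbehind(input) {
    return input.replace(/\bfoo\b|(?<=http:\/\/\S*)foo/g, 'bar');
}

const sample = 'foo myfoo http://thefoobar.com/food is awesome';
console.log(replaceWithAlternation(sample)); // bar myfoo http://thebarbar.com/bard is awesome
console.log(replaceWithLookbehind(sample));  // bar myfoo http://thebarbar.com/bard is awesome
```

The alternation version runs on older engines as well, which may matter if look-behind support cannot be assumed.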

Answer 1: score 4, accepted, posted 2013-03-12 20:51:10
Answer 2: score 0, posted 2013-03-12 20:44:14
Answer 3: score -1, posted 2013-03-12 19:48:21
