How does this regular expression with capturing group and backreference match in Java?

Question

I'm having a hard time understanding what a certain Java regex would match:

"<(\\w+)></\\1>"

I've read through this http://docs.oracle.com/javase/tutorial/essential/regex/

But I still can't figure out what that expression would match, especially the \\1 part. I can see that <(\\w+)> is a possessive quantifier matching any word, but I don't understand why it uses the (), which according to the tutorial are for matching a group.

As for the second part, I just don't know what \\1 would match. I tried it with

"001123344556678899".replaceAll("\\1", "");

since I thought it might match a number, but it gave me back my string as is, with nothing replaced.

Answer 1

It's intended to match pairs of XML/HTML tags, such as

<tag></tag>

The \\1 means "match the same text as the first matched group", i.e. the thing in the parentheses. (The double backslash is because backslashes need to be escaped in Java string literals.)
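To make the backreference concrete, here is a minimal sketch that runs the pattern from the question against a few inputs. The class and method names (TagPairDemo, isTagPair) are made up for illustration only.

```java
import java.util.regex.Pattern;

class TagPairDemo {
    // Same pattern as in the question; "\\1" in the string literal is the regex backreference \1.
    static final Pattern TAG_PAIR = Pattern.compile("<(\\w+)></\\1>");

    // True only if the whole input is an opening tag immediately followed by
    // a closing tag with the same name.
    static boolean isTagPair(String input) {
        return TAG_PAIR.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isTagPair("<tag></tag>"));  // true
        System.out.println(isTagPair("<a></b>"));      // false: \1 must repeat "a"
        System.out.println(isTagPair("<tag>x</tag>")); // false: nothing is allowed between the tags
    }
}
```

The second call fails because \1 does not mean "any word"; it only matches the exact text that group 1 captured earlier in the same match.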

Answer 2

I think you may have misunderstood the tutorial. Anything inside () is a set (a capturing group), so (\\w{1})(\\w{1}) means you have 2 sets with 1 character in each. Inside the pattern, \\1 references the first set. In a Java replacement string the same set is written as $1, so in your search and replace it is more like this:

"1234234234234".replaceAll("(23)", "$1ab")

and the result would be "123ab423ab423ab..." , because $1 gives you back what was matched by your first set.
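As a rough sketch of the two places a group can be referenced in Java: \\1 is the form used inside the pattern, while String.replaceAll expects $1 in the replacement string. The class and method names below (ReplaceGroupDemo, appendAb, collapseDoubles) are hypothetical, and the second method is just one possible reading of what the asker's replaceAll experiment was aiming at.

```java
class ReplaceGroupDemo {
    // Re-inserts whatever group 1 captured ("23") and appends "ab" after it.
    static String appendAb(String input) {
        return input.replaceAll("(23)", "$1ab");
    }

    // (\d)\1 matches a digit immediately followed by the same digit;
    // replacing the pair with $1 keeps a single copy of that digit.
    static String collapseDoubles(String input) {
        return input.replaceAll("(\\d)\\1", "$1");
    }

    public static void main(String[] args) {
        System.out.println(appendAb("1234234234234"));
        // prints 123ab423ab423ab423ab4
        System.out.println(collapseDoubles("001123344556678899"));
        // prints 0123456789
    }
}
```

A backreference only makes sense when the pattern contains the group it refers to, which is why a pattern consisting of \\1 alone leaves the input unchanged.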

Answer 3

Just refresh your understanding of regex backreferences (and capturing groups). A capturing group uses (), and a backreference matches the same text that the referenced group captured.

Then use this site to test your expression and your data like this:

Regular Expression: <(\w+)></\1> would become a Java string "<(\\w+)></\\1>" with input like this <body></body> :

Test  Target String  matches()  replaceFirst()  replaceAll()  group(0)       group(1)
1     <body></body>  Yes        Yes             Yes           <body></body>  body
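The group(0) and group(1) columns can also be reproduced without an online tester. This is a minimal sketch using java.util.regex.Matcher; the class and method names (GroupInspectDemo, groupsOf) are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupInspectDemo {
    static final Pattern TAG_PAIR = Pattern.compile("<(\\w+)></\\1>");

    // Returns { group(0), group(1) } when the whole input matches,
    // or an empty array when it does not.
    static String[] groupsOf(String input) {
        Matcher m = TAG_PAIR.matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        // group(0) is the entire match; group(1) is the text captured by (\w+).
        return new String[] { m.group(0), m.group(1) };
    }

    public static void main(String[] args) {
        String[] groups = groupsOf("<body></body>");
        System.out.println(groups[0]); // <body></body>
        System.out.println(groups[1]); // body
    }
}
```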
