
How does this regular expression with capturing group and backreference match in Java?

I'm having a hard time understanding what a certain Java regex would match:

"<(\\w+)></\\1>"

I've read through this http://docs.oracle.com/javase/tutorial/essential/regex/

But I still can't figure out what that expression would match, especially the \\1 part. I can see that <(\\w+)> is a possessive quantifier matching any word, but I don't understand why the () is used, which according to the tutorial is for matching a group.

As for the second part, I just don't know what \\1 would match. I tried it with

"001123344556678899".replaceAll("\\1", ""); 

since I thought maybe it matches a number, but it gave me back my string as is, with nothing replaced.

It's intended to match an opening XML/HTML tag immediately followed by its matching closing tag, such as

<tag></tag>

The \\1 means: match whatever the first capturing group matched, i.e. the text matched by the part in the parentheses. (The double backslash is there because backslashes need to be escaped in Java string literals.)
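As a minimal sketch of that behaviour, the pattern from the question can be tried against a few inputs. The class and method names here (TagPairDemo, isEmptyElement) are made up for illustration only:

```java
import java.util.regex.Pattern;

class TagPairDemo {
    // Same pattern as in the question. In the Java literal, "\\w" is the regex \w
    // and "\\1" is the regex \1, a backreference to capture group 1.
    static final Pattern EMPTY_ELEMENT = Pattern.compile("<(\\w+)></\\1>");

    // True only if the whole input is an opening tag directly followed
    // by a closing tag with the same name.
    static boolean isEmptyElement(String s) {
        return EMPTY_ELEMENT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isEmptyElement("<tag></tag>")); // true
        System.out.println(isEmptyElement("<a></b>"));     // false: \1 must repeat "a"
        System.out.println(isEmptyElement("<a>text</a>")); // false: nothing is allowed between the tags
    }
}
```

The second case is the interesting one: without the backreference, a pattern like <(\\w+)></\\w+> would accept mismatched tag names.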

I think you may have misunderstood the tutorial. Anything inside () is a group, so (\\w{1})(\\w{1}) means you have 2 groups with 1 character in each, and \\1 references the first group. In the replacement string of a search and replace, Java writes that reference as $1 instead, so it is more like this:

"1234234234234".replaceAll("(23)", "$1ab")

and the result would be "123ab423ab423ab423ab4"; $1 gives you back what was matched by the first group.
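A small sketch may help separate the two places a group can be referenced in Java: \1 inside the pattern and $1 inside the replacement string. The class and method names below are hypothetical, and collapseDoubles is only one guess at what the attempt in the question may have been after:

```java
class ReplaceDemo {
    // In the replacement string, Java refers to capture group 1 as $1.
    static String tagEach23(String input) {
        return input.replaceAll("(23)", "$1ab");
    }

    // Inside the pattern, \1 (written "\\1" in a Java literal) must repeat
    // exactly what group 1 just matched, so (\d)\1 finds a doubled digit.
    static String collapseDoubles(String input) {
        return input.replaceAll("(\\d)\\1", "$1");
    }

    public static void main(String[] args) {
        System.out.println(tagEach23("1234234234234"));            // 123ab423ab423ab423ab4
        System.out.println(collapseDoubles("001123344556678899")); // 0123456789
    }
}
```

A pattern consisting of \\1 alone has no group 1 to refer back to, which fits the observation in the question that nothing was replaced.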

Just refresh your understanding of regex backreferences (and capturing groups), e.g. here. A capturing group is written with (), and a backreference matches the text captured by the group it refers to.

Then use this site to test your expression and your data like this:

Regular Expression: <(\w+)></\1> would become the Java string "<(\\w+)></\\1>", with input like this <body></body> :

Test  Target String   matches()  replaceFirst()  replaceAll()  group(0)       group(1)
1     <body></body>   Yes        Yes             Yes           <body></body>  body
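The group(0) and group(1) columns can be reproduced with java.util.regex.Matcher. This is only a sketch: GroupDemo and groups are illustrative names, not anything the test site itself uses:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Returns { group(0), group(1) } if the whole input matches, otherwise null.
    static String[] groups(String input) {
        Matcher m = Pattern.compile("<(\\w+)></\\1>").matcher(input);
        if (!m.matches()) {
            return null;
        }
        // group(0) is the entire match; group(1) is what (\w+) captured.
        return new String[] { m.group(0), m.group(1) };
    }

    public static void main(String[] args) {
        String[] g = groups("<body></body>");
        System.out.println(g[0]); // <body></body>
        System.out.println(g[1]); // body
    }
}
```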
