
Processing markup tags with Java regex

I have received a text that contains some markup tags. For example:

Jane and Jack <record>went</record> to <record>cinema</record>.

My objective is to convert this sentence to:

Jane and Jack {blank} to {blank}.

When I use the following:

text.replaceAll("<record>.*</record>", "{blank}");

I receive "Jane and Jack {blank}." instead of the sentence above.

What is the best way to approach this problem?

This should do it:

text.replaceAll("<record>.*?</record>", "{blank}");

Adding the ? makes the quantifier non-greedy (reluctant), so .*? matches as few characters as possible and stops at the first </record> instead of running on to the last one.
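
To see the difference side by side, here is a minimal sketch that runs both patterns on the sentence from the question. The class name BlankRecords and the helper blankOut are made up for illustration; only String.replaceAll comes from the original code.

```java
public class BlankRecords {
    // Hypothetical helper: the reluctant quantifier .*? stops at the first
    // closing tag, so each <record>...</record> pair is replaced on its own.
    static String blankOut(String text) {
        return text.replaceAll("<record>.*?</record>", "{blank}");
    }

    public static void main(String[] args) {
        String text = "Jane and Jack <record>went</record> to <record>cinema</record>.";

        // Greedy: .* runs to the LAST </record>, swallowing everything in between.
        System.out.println(text.replaceAll("<record>.*</record>", "{blank}"));
        // prints: Jane and Jack {blank}.

        // Reluctant: one replacement per tag pair.
        System.out.println(blankOut(text));
        // prints: Jane and Jack {blank} to {blank}.
    }
}
```

Two things to keep in mind: replaceAll returns a new String and leaves text unchanged, so the result has to be assigned or used; and by default . does not match line terminators, so a record whose content spans several lines would need the DOTALL flag, for example "(?s)<record>.*?</record>".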

Also note that replacements like this are best left to an XML parser, unless they are simple.
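
For comparison, a parser-based version might look roughly like the sketch below. It rests on assumptions that are not in the question: the text is well-formed XML once wrapped in a root element (the name root is arbitrary), and only the plain text of the result is needed. The class BlankRecordsDom and its blankOut method are hypothetical names; the DOM calls are from the JDK's standard javax.xml.parsers and org.w3c.dom packages.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class BlankRecordsDom {
    // Hypothetical helper: replaces every <record> element with the text "{blank}"
    // and returns the remaining plain text.
    static String blankOut(String fragment) {
        try {
            // Assumption: the fragment has no root element, so wrap it in one.
            String wrapped = "<root>" + fragment + "</root>";
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(wrapped)));

            // The NodeList is live: it shrinks as elements are replaced,
            // so keep taking the first item until none are left.
            NodeList records = doc.getElementsByTagName("record");
            while (records.getLength() > 0) {
                Node record = records.item(0);
                record.getParentNode().replaceChild(doc.createTextNode("{blank}"), record);
            }
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Malformed markup or parser configuration problems end up here.
            throw new IllegalArgumentException("Could not parse fragment as XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(blankOut("Jane and Jack <record>went</record> to <record>cinema</record>."));
        // prints: Jane and Jack {blank} to {blank}.
    }
}
```

This is clearly heavier than the one-line regex, which is why the regex is reasonable for a simple case like the one in the question. The parser starts to pay off when records can be nested, carry attributes, or contain other markup.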

