I want to capture multiple substrings that match a specific pattern. For example, my string looks like this:
String textData = "#1_Label for UK#2_Label for US#4_Label for FR#";
I want to get the strings between two # characters that contain a given search string, such as UK.
If the search string is UK, then the output should be 1_Label for UK.
If the search string is label, then the output should be 1_Label for UK, 2_Label for US and 4_Label for FR.
If the search string is 1_, then the output should be 1_Label for UK.
I don't want to extract the data via an array list, and the extraction should be case insensitive.
Can you please help me with this problem?
Regards, Ashish Mishra
You can use this regex for search:
#([^#]*?Label[^#]*)(?=#)
Replace Label with your search keyword.
Java Pattern:
Pattern p = Pattern.compile( "#([^#]*?" + Pattern.quote(keyword) + "[^#]*)(?=#)", Pattern.CASE_INSENSITIVE );
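As a minimal sketch of how that pattern might be used, the snippet below loops over Matcher.find() and joins capture group 1 of every match into one comma-separated string. The class name HashSegmentFinder and the method findSegments are hypothetical names chosen for illustration, and Pattern.CASE_INSENSITIVE is passed because the question asks for case-insensitive matching.

```java
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original answer.
final class HashSegmentFinder {

    // Returns every '#'-delimited segment that contains the keyword, joined by ", ".
    static String findSegments(String text, String keyword) {
        Pattern p = Pattern.compile(
                "#([^#]*?" + Pattern.quote(keyword) + "[^#]*)(?=#)",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        StringJoiner result = new StringJoiner(", ");
        while (m.find()) {
            result.add(m.group(1)); // group 1 leaves out the leading '#'
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String textData = "#1_Label for UK#2_Label for US#4_Label for FR#";
        System.out.println(findSegments(textData, "UK"));    // 1_Label for UK
        System.out.println(findSegments(textData, "label")); // 1_Label for UK, 2_Label for US, 4_Label for FR
        System.out.println(findSegments(textData, "1_"));    // 1_Label for UK
    }
}
```

Because the trailing hash is only checked by the look-ahead and not consumed, it is still available as the leading hash of the next match, so adjacent segments are not skipped.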
If the data is always between two hashes, try a regex like this: (?i)#.*your_match.*# where your_match would be UK, label, 1_ etc.
Then use this expression in conjunction with the Pattern and Matcher classes.
If you want to match multiple strings, you'd need to exclude the hashes from the match by using look-arounds as well as reluctant quantifiers, e.g. (?i)(?<=#).*?label.*?(?=#).
Short breakdown:
(?i) makes the expression case insensitive.
(?<=#) is a positive look-behind, i.e. the match must be preceded by a hash (which is not included in the match).
.*? matches any sequence of characters but is reluctant, i.e. it tries to match as few characters as possible.
(?=#) is a positive look-ahead, which means the match must be followed by a hash (also not included in the match).
Without the look-arounds the hashes would be included in the match, and thus using Matcher.find() you'd skip every other label in your test string, i.e. you'd get the matches #1_Label for UK# and #4_Label for FR# but not #2_Label for US#.
Without the reluctant quantifiers the expression would match everything between the first and the last hash.
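The following sketch compares the three variants discussed above on the test string from the question. LookAroundDemo and allMatches are hypothetical names used only for this illustration; the regexes themselves are the ones from the text, with label as the search keyword.

```java
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; it only illustrates the behaviour described above.
final class LookAroundDemo {

    // Joins all matches of the regex in the text with " | " so they are easy to compare.
    static String allMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        StringJoiner out = new StringJoiner(" | ");
        while (m.find()) {
            out.add(m.group());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String textData = "#1_Label for UK#2_Label for US#4_Label for FR#";

        // Hashes consumed by the match: every other label is skipped.
        System.out.println(allMatches("(?i)#.*?label.*?#", textData));
        // prints: #1_Label for UK# | #4_Label for FR#

        // Look-arounds plus reluctant quantifiers: all three labels are found.
        System.out.println(allMatches("(?i)(?<=#).*?label.*?(?=#)", textData));
        // prints: 1_Label for UK | 2_Label for US | 4_Label for FR

        // Look-arounds with greedy quantifiers: one match spanning first to last hash.
        System.out.println(allMatches("(?i)(?<=#).*label.*(?=#)", textData));
        // prints: 1_Label for UK#2_Label for US#4_Label for FR
    }
}
```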
As a better alternative, replace .*? with [^#]*, which means that the match cannot contain any hash. This removes the need for reluctant quantifiers and also removes the problem that looking for US would match 1_Label for UK#2_Label for US.
So most probably the final regex you're after looks like this: (?i)(?<=#)[^#]*your_match[^#]*(?=#).
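To make the difference concrete, here is a small sketch that searches for US with both the reluctant version and the [^#]* version. The names NoHashMatcher, reluctantRegex, noHashRegex and matches are hypothetical, and wrapping the keyword in Pattern.quote() is an added assumption so that keywords containing regex metacharacters are treated literally.

```java
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing the two regex variants from the text.
final class NoHashMatcher {

    // Variant with reluctant quantifiers; the match may still run across a hash.
    static String reluctantRegex(String keyword) {
        return "(?i)(?<=#).*?" + Pattern.quote(keyword) + ".*?(?=#)";
    }

    // Variant with a negated character class; the match can never contain a hash.
    static String noHashRegex(String keyword) {
        return "(?i)(?<=#)[^#]*" + Pattern.quote(keyword) + "[^#]*(?=#)";
    }

    // Joins all matches with ", ".
    static String matches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        StringJoiner out = new StringJoiner(", ");
        while (m.find()) {
            out.add(m.group());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String textData = "#1_Label for UK#2_Label for US#4_Label for FR#";
        System.out.println(matches(reluctantRegex("US"), textData));
        // prints: 1_Label for UK#2_Label for US
        System.out.println(matches(noHashRegex("US"), textData));
        // prints: 2_Label for US
        System.out.println(matches(noHashRegex("label"), textData));
        // prints: 1_Label for UK, 2_Label for US, 4_Label for FR
    }
}
```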
([^#]*UK[^#]*) for UK
([^#]*Label[^#]*) for Label
([^#]*1_[^#]*) for 1_
Try this. Grab the captures. See the demo:
http://regex101.com/r/kQ0zR5/3
I have solved this problem with the pattern below:
(?i)([^#]*?us[^#]*)(?=#)
Thank you so much Anubhava, VKS and Thomas for your replies.
Regards,
Ashish Mishra