I have the following list of products (in a .txt file):
#ART#NC3FX;price1
#ART#NC3FX;price2
#ART#NC3FX;price3
#ART#NC3FXX;price1
#ART#NC3FXX;price2
#ART#NC3FXX;price3
#ART#NC3FXX;price1
#ART#NC3FXX;price2
#ART#NC3FXX;price3
#ART#NC3FX-HD;price1
#ART#NC3FX-HD;price2
#ART#NC3FX-HD;price3
I'd like to get all the occurrences of the first one (#ART#NC3FX).
Using this regular expression
@"(^|\b)#ART#NC3FX(\b|$)";
I retrieve the first three lines, which is fine, but I also get the lines for the reference #ART#NC3FX-HD.
What should I do to prevent this from happening?
Thanks !
Your regex finds a match because the hyphen (-) is not a word character, and \b only tells the regex engine that the character after X must be a non-word character (or the end of the string). The position between X and - is a word boundary, so #ART#NC3FX-HD matches too.
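To make the behaviour above concrete, here is a minimal sketch in Python (an assumption on my side: the question's @"..." literals look like C#, but \b works the same way for this ASCII input). The three sample lines are taken from the list in the question.

```python
import re

# One representative line per product reference from the question.
lines = [
    "#ART#NC3FX;price1",
    "#ART#NC3FXX;price1",
    "#ART#NC3FX-HD;price1",
]

# The original pattern from the question, unchanged.
original = re.compile(r"(^|\b)#ART#NC3FX(\b|$)")

# \b succeeds between X and ';' and also between X and '-',
# because both ';' and '-' are non-word characters.
matched = [line for line in lines if original.search(line)]
print(matched)
```

The NC3FXX line is rejected because X followed by X is not a word boundary, while the -HD line slips through.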
You may use a negative lookahead:
@"\B#ART#NC3FX(?![\w-])"
The \B matches a position that is not a word boundary. Since # is a non-word character, that means either the start of the string or a position right after another non-word character. The (?![\w-]) lookahead fails the match if the reference is followed by a word character or a hyphen. If you test independent strings, replace \B with ^ (start of string).
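A quick way to check the lookahead version is sketched below in Python (assumed here only for illustration; the pattern is copied as-is from the answer, and the last sample line, with a leading letter, is a made-up case to show what \B rejects).

```python
import re

lines = [
    "#ART#NC3FX;price1",
    "#ART#NC3FX;price2",
    "#ART#NC3FXX;price1",
    "#ART#NC3FX-HD;price1",
    "Z#ART#NC3FX;price1",  # hypothetical line, not in the question
]

# \B      : no word boundary before '#', i.e. line start or a non-word char
# (?!...) : the reference must not be followed by a word char or a hyphen
pattern = re.compile(r"\B#ART#NC3FX(?![\w-])")

hits = [line for line in lines if pattern.search(line)]
print(hits)
```

Only the two plain NC3FX lines remain. The NC3FXX and -HD lines are rejected by the lookahead, and the made-up Z line is rejected by \B, because the position between Z and # is a word boundary.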
I'm not sure if I understand your answer correctly, but why don't you look for the first ; like:
@"^#ART#NC3FX(;|$)"
EDIT: See Avinash's Answer
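As a rough sketch of the semicolon idea, assuming the whole file is read into one string: ^ and $ then need multi-line mode to apply to each line (re.MULTILINE in Python, used here for illustration; RegexOptions.Multiline is the .NET counterpart). The last sample line, a bare reference with no price, is made up to show the $ branch.

```python
import re

# Sample text standing in for the .txt file contents.
text = "\n".join([
    "#ART#NC3FX;price1",
    "#ART#NC3FXX;price1",
    "#ART#NC3FX-HD;price1",
    "#ART#NC3FX",  # hypothetical line without a price
])

# The reference must start a line and be followed by ';' or the line end.
pattern = re.compile(r"^#ART#NC3FX(;|$)", re.MULTILINE)

hits = [m.group(0) for m in pattern.finditer(text)]
print(hits)
```

This relies on every reference being terminated by ; or the end of the line, which holds for the list shown in the question.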