How to find the index of words matched in a text?

I am extracting the indexes of words in a text that match the regex below. The regex matches all the required words, but it also matches the whitespace to the left of the number. The match is bounded on the right side by \b, but it is not bounded on the left side.

Regex:

(price|rs)?\s*(\d+[\s\d.]*\s*?(pkg|k|m|(?:la(?:c|kh|k)|crore|cr)s?|l)\b\.?)

Input text:

    This should matchprice  5.6 lacincluding price(i.e  price 5.6 lac) and rs 56 m. including rs (i.e rs 56 k  rs 56 m) .

It also matches when no price or rs is written. For example, "                   56 k" and "   8.8 crs." are correct matches, but the matched string should be bounded on the left side as well, just as the match does not include the space after the end of the matched string.

It should not match the spaces to the left of 8.5 in this:      8.5 lac. It should not match eitherrs 6 lac either, as there is no space before 5.6

How can I modify the above regex to bound the matched word on the left side as well?
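
The behaviour described above can be reproduced with a short script. This is only a sketch: the question does not name a language, so Python's re module is an assumption, and match_spans and the two sample strings are made up for illustration.

```python
import re

# The regex from the question, unchanged.
ORIGINAL = re.compile(
    r"(price|rs)?\s*(\d+[\s\d.]*\s*?(pkg|k|m|(?:la(?:c|kh|k)|crore|cr)s?|l)\b\.?)"
)

def match_spans(text):
    """Return (start, end, matched_text) for every match of ORIGINAL in text."""
    return [(m.start(), m.end(), m.group(0)) for m in ORIGINAL.finditer(text)]

# The unanchored \s* swallows the leading spaces, so the match starts at index 0.
print(match_spans("   56 k"))         # [(0, 7, '   56 k')]

# Without a left boundary, "rs" is picked up from inside "eitherrs".
print(match_spans("eitherrs 6 lac"))  # [(6, 14, 'rs 6 lac')]
```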

You may move the \s* into an optional non-capturing group that starts with a word boundary:

(?:\b(price|rs)\s*)?(\d+[\s\d.]*\s*?(pkg|k|m|(?:la(?:c|kh|k)|crore|cr)s?|l)\b\.?)
^^^^^^^^^^^^^^^^^^^^

See the regex demo

The (?:\b(price|rs)\s*)? pattern matches a word boundary, followed by price or rs, followed by zero or more whitespace characters. Because of the ? quantifier, the whole group is optional: it matches once or not at all. When the group does not match, no whitespace is consumed, so the match starts at the first digit.
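
To check the match indexes, here is a minimal sketch using Python's re module. The language is an assumption, since the question does not name one, and fixed_spans and the sample strings are hypothetical names and inputs chosen for illustration.

```python
import re

# The modified regex: \s* now lives inside the optional, \b-anchored prefix group.
FIXED = re.compile(
    r"(?:\b(price|rs)\s*)?(\d+[\s\d.]*\s*?(pkg|k|m|(?:la(?:c|kh|k)|crore|cr)s?|l)\b\.?)"
)

def fixed_spans(text):
    """Return (start, end, matched_text) for every match of FIXED in text."""
    return [(m.start(), m.end(), m.group(0)) for m in FIXED.finditer(text)]

# Leading spaces are no longer part of the match.
print(fixed_spans("   56 k"))         # [(3, 7, '56 k')]

# A standalone prefix is still included in the match.
print(fixed_spans("rs 56 m."))        # [(0, 8, 'rs 56 m.')]

# "rs" glued to another word fails the \b check, so only the amount matches.
print(fixed_spans("eitherrs 6 lac"))  # [(9, 14, '6 lac')]
```

In each match, group 1 holds the optional prefix (None when absent), group 2 the amount with its unit, and group 3 the unit alone, so m.start(2) gives the index of the first digit even when a prefix is present.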
