How to find the index of words matched in a text?

I am extracting the indexes of words matched by this regex. It matches all the required words in the text, but it also matches the whitespace to the left of the match. The matched string is not bounded on the left side, whereas its right side is bounded with \b.

Regex:

(price|rs)?\s*(\d+[\s\d.]*\s*?(pkg|k|m|(?:la(?:c|kh|k)|crore|cr)s?|l)\b\.?)

Input text:

    This should matchprice  5.6 lacincluding price(i.e  price 5.6 lac) and rs 56 m. including rs (i.e rs 56 k  rs 56 m) .

It matches correctly when no price or rs is written. For example, "                  56 k" or "  8.8 crs." are correct matches, but the matched string should also be bounded on the left side, just as it does not include the space after the end of the match.

It should not match the spaces to the left of 8.5 in "      8.5 lac", and it should not match the rs in "eitherrs 6 lac" either, as there is no space before it.

How can I modify the above regex to bound the matched word on the left side as well?

You may move the \s* into an optional non-capturing group:

(?:\b(price|rs)\s*)?(\d+[\s\d.]*\s*?(pkg|k|m|(?:la(?:c|kh|k)|crore|cr)s?|l)\b\.?)
^^^^^^^^^^^^^^^^^^^^

See the regex demo.

The (?:\b(price|rs)\s*)? pattern matches a word boundary, followed by price or rs, followed by 0+ whitespace chars. The whole group is tried once and is optional due to the ? modifier (the whole sequence of patterns can match 1 or 0 times). Because \s* now sits inside the optional group, whitespace is only consumed when it follows price or rs, so a match without a prefix starts directly at the first digit.
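To see the effect on the reported indexes, here is a minimal Python sketch that runs both patterns over a short sample. The helper name find_matches and the sample string are illustrative assumptions, not part of the original question; the two patterns are copied verbatim from above.

```python
import re

# Original pattern from the question: \s* sits outside any group,
# so leading whitespace becomes part of the match.
ORIGINAL = re.compile(
    r"(price|rs)?\s*(\d+[\s\d.]*\s*?(pkg|k|m|(?:la(?:c|kh|k)|crore|cr)s?|l)\b\.?)"
)

# Suggested pattern: \b and \s* live inside an optional non-capturing group.
FIXED = re.compile(
    r"(?:\b(price|rs)\s*)?(\d+[\s\d.]*\s*?(pkg|k|m|(?:la(?:c|kh|k)|crore|cr)s?|l)\b\.?)"
)

# Hypothetical sample text, chosen to cover the three cases in the question.
SAMPLE = "cost      8.5 lac, eitherrs 6 lac, price 5.6 lac"


def find_matches(pattern, text):
    """Return (start, end, matched_text) for every match of pattern in text."""
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]


if __name__ == "__main__":
    for name, pattern in (("original", ORIGINAL), ("fixed", FIXED)):
        print(name)
        for start, end, matched in find_matches(pattern, SAMPLE):
            print(f"  [{start}:{end}] {matched!r}")
```

With the fixed pattern, the first match starts at the digit 8 rather than at the run of spaces before it, and the rs glued to "either" is no longer picked up as a prefix, while "price 5.6 lac" is still matched with its prefix.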
