
Using regex to match a combination of words in the same sentence in PHP

I would like to use a regular expression to find certain combinations of words in a phrase in PHP. I can't even get the regex part to work.

The regex should match any phrase that has one of the words (proficient/proficiency/fluent) together with one of (chinese/mandarin/cantonese) in the same sentence. So it would match "She is fluent in Chinese." and "His proficiency in Mandarin is excellent".

regex = (fluent)|(proficient)|(proficiency).*(chinese)|(mandarin)|(cantonese)

I can get it to match the word fluent, but how do I make it require both words in the same sentence before it is considered a match?

Your grouping is wrong. As written, each | separates a complete alternative, so fluent on its own already counts as a match. It should rather be:

(fluent|proficient|proficiency)[^.]*(chinese|mandarin|cantonese)

[^.] ensures (naively) that both words occur within the same sentence: it treats every period as a sentence end and ignores ! and ?. Also, don't forget the i flag to match capitalized words like Chinese.
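As a rough illustration, the grouped pattern could be used with preg_match like this. The sample phrases and the variable names are made up for the example; only the pattern itself comes from the answer above.

```php
<?php
// Grouped alternations; [^.]* keeps both words before the next period.
// The trailing "i" flag makes the match case-insensitive.
$pattern = '/(fluent|proficient|proficiency)[^.]*(chinese|mandarin|cantonese)/i';

// Sample input, invented for this sketch.
$phrases = [
    'She is fluent in Chinese.',
    'His proficiency in Mandarin is excellent',
    'She is fluent. He likes Chinese food.',
];

$results = [];
foreach ($phrases as $phrase) {
    // preg_match returns 1 when the pattern matches and 0 when it does not.
    $results[] = preg_match($pattern, $phrase) === 1;
}

var_dump($results);
```

The third phrase is rejected because a period sits between the two words.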

((fluent)|(proficient)|(proficiency)).*((chinese)|(mandarin)|(cantonese))

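You need to wrap the alternations in an outer pair of parentheses, and if you also want to match the whole sentence, you need something like the following: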

[.!?].*((fluent)|(proficient)|(proficiency)).*((chinese)|(mandarin)|(cantonese)).*[.!?]
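Note that the pattern above needs a sentence terminator before the sentence, and .* can run across sentence boundaries. A possible variant, sketched here as an assumption rather than as the answerer's method, replaces .* with [^.!?]* so the captured text stays inside one sentence. The sample text and variable names are invented for the example.

```php
<?php
// Variant pattern (an assumption, not the original answer's regex):
// [^.!?]* cannot cross a sentence terminator, and the final [.!?]? picks up
// the closing punctuation when there is one.
$pattern = '/[^.!?]*(fluent|proficient|proficiency)[^.!?]*(chinese|mandarin|cantonese)[^.!?]*[.!?]?/i';

// Sample input, invented for this sketch.
$text = 'He is tall. She is fluent in Chinese. Bye.';

$sentence = null;
if (preg_match($pattern, $text, $m) === 1) {
    // $m[0] holds the whole match; trim the space left over from the
    // previous sentence boundary.
    $sentence = trim($m[0]);
}

var_dump($sentence);
```

This still treats every period as a sentence end, so abbreviations and decimal numbers will cut a sentence short.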

If the order doesn't matter, you could use two regexes, one for the first group and one for the second. Then you match twice, and if both hit, you have a match.
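A minimal sketch of the two-regex idea, assuming the input is already a single sentence. The function name mentionsSkillAndLanguage is hypothetical and chosen only for this example.

```php
<?php
// Hypothetical helper: true when both word groups occur, in either order.
function mentionsSkillAndLanguage(string $sentence): bool
{
    $skill = '/(fluent|proficient|proficiency)/i';
    $language = '/(chinese|mandarin|cantonese)/i';

    // Both patterns must hit; their relative position does not matter.
    return preg_match($skill, $sentence) === 1
        && preg_match($language, $sentence) === 1;
}

var_dump(mentionsSkillAndLanguage('Her Mandarin is fluent.'));  // language first
var_dump(mentionsSkillAndLanguage('She is fluent in French.')); // no listed language
```

Because each pattern searches the entire string, this only enforces "same sentence" if the text has been split into sentences beforehand.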

If you're dealing with running text, I would try to split it into sentences first.
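One way this might look, as a sketch: split on whitespace that follows ., ! or ?, then test each sentence separately. The splitting rule is a naive assumption (it will break on abbreviations such as "e.g."), and the sample text is invented.

```php
<?php
// Sample input, invented for this sketch.
$text = 'She is fluent. He likes Chinese food. His proficiency in Mandarin is excellent!';

// Naive sentence split: break on whitespace preceded by ., ! or ?.
$sentences = preg_split('/(?<=[.!?])\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

// Each sentence is checked on its own, so .* cannot reach into a neighbour.
$pattern = '/(fluent|proficient|proficiency).*(chinese|mandarin|cantonese)/i';

// preg_grep keeps the original keys; array_values reindexes from 0.
$matching = array_values(preg_grep($pattern, $sentences));

var_dump($matching);
```

Here "fluent" and "Chinese" fall in different sentences, so only the last sentence is kept.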

