
Ruby regex: split a string by "." but not when it is part of a word contained in an exclusion list

I want to split sentences on a specific character, but only when that character is not part of a word contained in an exclusion list. For example, I want to split a sentence on the full stop ".", but not when it comes right after "Dr" or "Prof". For example:

"Im a Dr. of Physics and my Name is Sheldon Cooper. Im working at the University of Pasadena."

So the regex should split only at the full stop after "Cooper", not at the one after "Dr".

You can use a negative lookbehind:

a = "Im a Dr. of Physics and my Name is Sheldon Cooper. Im working at the University of Pasadena."
a.split(/(?<!Dr|Prof)\./)
#=> ["Im a Dr. of Physics and my Name is Sheldon Cooper", " Im working at the University of Pasadena"]

You can define the titles separately. They have to be listed explicitly; there's no other way to do that. Set them up as an alternation like this: Dr|Prof|Assoc
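As a rough sketch of keeping the titles in a separate list, the lookbehind pattern can be built from an array. The names `TITLES`, `sentence_splitter` and `split_sentences` are made up for this illustration, and the stripping of surrounding whitespace is an added assumption, not part of the answers above.

```ruby
# Hypothetical exclusion list; extend it with any abbreviations you need.
TITLES = %w[Dr Prof Assoc].freeze

# Builds a pattern equivalent to /(?<!Dr|Prof|Assoc)\./ from the list.
# The alternatives are joined directly inside the lookbehind (no extra
# group), since Ruby's lookbehind only accepts alternatives of different
# lengths at its top level.
def sentence_splitter(titles)
  alternation = titles.map { |t| Regexp.escape(t) }.join("|")
  Regexp.new("(?<!#{alternation})\\.")
end

# Splits on every full stop that does not directly follow a listed title,
# then trims whitespace and drops empty pieces.
def split_sentences(text, titles = TITLES)
  text.split(sentence_splitter(titles)).map(&:strip).reject(&:empty?)
end

text = "Im a Dr. of Physics and my Name is Sheldon Cooper. Im working at the University of Pasadena."
sentences = split_sentences(text)
puts sentences
# Im a Dr. of Physics and my Name is Sheldon Cooper
# Im working at the University of Pasadena
```

Note that the lookbehind only checks the characters directly before the full stop, so any word that merely ends in "Dr" or "Prof" would also be skipped.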
