
Regex format for two strings

I want a regular expression that extracts the words happy and good, matching both non-greedily and case-insensitively.

@a = [" I am very HAppy!!", "sad today..", "happy. to hear about this..", "the day is good", "sad one", "sad story"]

This seems to work with one word:

@z = @a.join.scan(/\bhappy\b/i)

But when I add good, it does not work as I expect:

@z = @a.join.scan(/\bhappy|good\b/i) 

Expected (happy twice and good once):

@z.size => 3

The result it gives me:

@z.size => 2

You should add parentheses around your alternation so that the \b anchors apply to either happy or good as a unit:

\b(happy|good)\b
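
To illustrate why the grouping matters, here is a minimal sketch. Without parentheses, the alternation splits the whole pattern into \bhappy on one side and good\b on the other, so each word is anchored at only one end. The sample strings below are made up for illustration and do not come from the question.

```ruby
# Without grouping, the pattern reads as (\bhappy) OR (good\b).
ungrouped = /\bhappy|good\b/i
# With grouping, both \b anchors apply to whichever word matches.
grouped = /\b(?:happy|good)\b/i

# Hypothetical sample strings (an assumption for this sketch).
samples = ["happyish", "feelgood", "happy", "good"]

ungrouped_hits = samples.select { |s| s.match?(ungrouped) }
grouped_hits   = samples.select { |s| s.match?(grouped) }

p ungrouped_hits  # "happyish" and "feelgood" slip through as well
p grouped_hits    # only the standalone words remain
```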

Then, you probably want to scan each element of the @a array rather than @a.join, so a map and a flatten are called for:

@a.map { |s| s.scan(/\b(happy|good)\b/i) }.flatten
# ["HAppy", "happy", "good"]

You could also use a non-capturing group:

\b(?:happy|good)\b

but it won't make any difference in this case, since flatten already unwraps the one-element arrays that scan returns for a capturing group.
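
As a small sketch of where the two forms do differ: String#scan returns an array of capture arrays when the pattern contains a capturing group, and a flat array of matched strings otherwise. The sample string here is hypothetical.

```ruby
# Hypothetical input, chosen only to show the shape of scan's return value.
s = "Happy days are good"

# A capturing group makes scan return one array per match.
with_capture = s.scan(/\b(happy|good)\b/i)

# A non-capturing group makes scan return the matched strings directly.
without_capture = s.scan(/\b(?:happy|good)\b/i)

p with_capture      # nested: [["Happy"], ["good"]]
p without_capture   # flat:   ["Happy", "good"]
```

After a flatten the two results are identical, which is why the choice does not matter in the map/flatten version.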

I assume you mean it matches both occurrences of happy, but not good. This is because you're looking at word boundaries, and when you join the strings, good runs straight into the next element and becomes goodsad, so there is no word boundary after it.

Remove the word boundary conditions and it should match as expected.
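
The effect of the join can be checked with the array from the question. This is a sketch only; the last line, joining with a space so the boundaries survive, is an alternative that neither answer proposes and is included here as an assumption about what may be wanted.

```ruby
a = [" I am very HAppy!!", "sad today..", "happy. to hear about this..",
     "the day is good", "sad one", "sad story"]

# join with no separator glues "good" onto the following "sad".
joined = a.join
glued = joined.include?("goodsad")

# With boundaries, "good" is missed because "goodsad" has no \b after "good".
bounded = joined.scan(/\b(?:happy|good)\b/i)

# Without boundaries all three are found, but so would "unhappy" or "goodness".
unbounded = joined.scan(/happy|good/i)

# Alternative (an assumption, not from the answers): keep the boundaries
# and join with a space so the elements stay separated.
spaced = a.join(" ").scan(/\b(?:happy|good)\b/i)

p glued, bounded, unbounded, spaced
```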
