
Ruby regex: extract word between single quotes

I'm looking for a regex to match:

ciao: c'iao 'ciao'

with:

ciao #every word excluding non-word character
c'iao #including apostrophes
ciao #excluding the quotes ''

So far I've been able to meet the first 2 requirements with:

/[\w']+/

but I'm struggling to extract the word between single quotes (without including the quotes). Note that I won't have a case where a word with an apostrophe is enclosed in quotes (like 'c'iao').

I've seen many similar Q&As but couldn't find any that suits my needs. Extra points for an answer that includes a brief explanation :)

You can use the following expression:

/\w+(?:'\w+)*/

See the Rubular demo.

The expression matches:

  • \w+ - 1 or more word chars
  • (?:'\w+)* - zero or more repetitions of the following sequence ((?:...) is a non-capturing group, and the * quantifier matches it 0 or more times):
    • ' - an apostrophe
    • \w+ - 1 or more word chars.

A short Ruby demo:

"ciao: c'iao 'ciao'".scan(/\w+(?:'\w+)*/)
# => ["ciao", "c'iao", "ciao"]
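
For readers who find the one-liner dense, the same pattern can be written in Ruby's extended (`/x`) mode with a comment on each part. This is only a sketch: the method name `words` and the second sample string are made up for illustration and are not part of the original answer.

```ruby
# Hypothetical helper wrapping the pattern from the answer above.
def words(text)
  text.scan(/
    \w+          # one or more word characters
    (?: ' \w+ )* # zero or more groups of: an apostrophe, then word characters
  /x)
end

p words("ciao: c'iao 'ciao'")  # => ["ciao", "c'iao", "ciao"]
p words("rock'n'roll 'yes'")   # => ["rock'n'roll", "yes"]
```

The enclosing quotes drop out because an apostrophe is only consumed when it sits between word characters: the opening quote has no word character before it, and the closing quote has none after it.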

Considering that words can begin or end with an apostrophe, or contain multiple apostrophes, I suggest first splitting on whitespace and then removing pairs of single quotes that enclose words.

str = "'Twas because Bo didn't like Bess' or y'all's 'attitude'"

str.split.map { |s| s =~ /\A'.+'\z/ ? s[1..-2] : s }
  #=> ["'Twas", "because", "Bo", "didn't", "like", "Bess'", "or", "y'all's", "attitude"]

The first step produces

arr = str.split
  #=> ["'Twas", "because", "Bo", "didn't", "like", "Bess'", "or", "y'all's", "'attitude'"]

The regex matches elements of arr that both begin and end with a single quote; for those, s[1..-2] drops the first and last characters.
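
As a rough sketch, the two steps can be wrapped in a small method and tried on the string from the question. The name `unquote_words` is hypothetical, and `match?` assumes Ruby 2.4 or later (the original uses `=~`, which behaves the same here).

```ruby
# Hypothetical helper: split on whitespace, then strip one pair of
# enclosing single quotes from any token that starts and ends with one.
def unquote_words(str)
  str.split.map { |s| s.match?(/\A'.+'\z/) ? s[1..-2] : s }
end

p unquote_words("ciao: c'iao 'ciao'")
# => ["ciao:", "c'iao", "ciao"]
```

Note that because this splits on whitespace only, other punctuation stays attached to the token (the colon in "ciao:" above), so it would need to be removed separately if only word characters are wanted.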
