
How do I make a regexp in Ruby that doesn't require matches / allows optional matches?

I'm trying to create a regexp that matches parts of some strings. It doesn't have to match every part, but at least one (which it always will).

I want: name, and year and/or season/episode.

Let's say I have these strings:

  1. i.want.this.as.name.2014.s01e02
  2. i still want a this 2010
  3. i also want this
  4. I still want this.S05E23.720p.HDTV.X264

I would like to get these matches:

1. 
name =  i.want.this.as.name.
year =  2014
seasonepisode =     s01e02
season =    01
episode =   02
2.
name = i still want a this
year = 2010
3.
name = i also want this
4.
name =  I still want this
seasonepisode =     s05e23
season =    05
episode =   23

Right now, I have this regexp:

(?<name>.*)(?<year>\d{4})(\s|\.|\z)*(?<seasonepisode>s(?<season>\d{1,2})e(?<episode>\d{1,2}))*

But I only get the desired result on the first string. I guess that is because there is no match for the full regexp in strings 2, 3 or 4.

Here you can try the regexp: http://rubular.com/r/1ypseJ7c6I

So my question is: how do I tell the regexp that I don't require matches on everything, just something? :-) I have tried adding an asterisk to the optional parts.

A 5€ donation to a project / charity of your choice for the correct answer :-)

This might work: http://rubular.com/r/4qYuzGGqaB. It uses the /ix options, the latter for readability.

^
(?<nm>.+?)        # Name: at least one character, non-greedy.
(?<yr>\d{4})?     # Year, optional.
(?:               # Post-year stuff, non-captured.
  [\s\.]
  s(?<se>\d\d?)   # Season.
  e(?<ep>\d\d?)   # Episode.
  (?<rest>.*)     # The rest, optional.
)?                # Post-year stuff is optional.
$                 # Must consume full line.
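
To try the pattern outside Rubular, it can be written as a Ruby regexp literal with the /ix options. The following is only a sketch: the constant PATTERN and the helper parse_title are names made up for illustration, and the sample strings are the four from the question.

```ruby
# The pattern above as a Ruby literal. /i ignores case, /x allows the
# whitespace and comments inside the pattern.
PATTERN = /
  ^
  (?<nm>.+?)        # Name: at least one character, non-greedy.
  (?<yr>\d{4})?     # Year, optional.
  (?:               # Post-year stuff, non-captured.
    [\s\.]
    s(?<se>\d\d?)   # Season.
    e(?<ep>\d\d?)   # Episode.
    (?<rest>.*)     # The rest, optional.
  )?                # Post-year stuff is optional.
  $                 # Must consume full line.
/ix

# Hypothetical helper (not part of the answer): returns a hash containing
# only the named groups that took part in the match, or nil on no match.
def parse_title(title)
  m = PATTERN.match(title)
  return nil unless m

  m.named_captures.compact
end

SAMPLES = [
  "i.want.this.as.name.2014.s01e02",
  "i still want a this 2010",
  "i also want this",
  "I still want this.S05E23.720p.HDTV.X264"
].freeze

SAMPLES.each { |s| p parse_title(s) }
```

Note that the name group can keep a trailing separator (for example the space before the year in the second string), so a caller may want to strip it. The pattern also has no group for the combined season/episode text the question asks for; wrapping the s…e… part in one more named group would be one way to get it.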

A couple of notes:

  • The non-greediness of the name group is important. Otherwise, it will happily consume the entire line (everything else is optional).

  • Requiring a full-line match is also important. Otherwise, the pattern will happily match only the first letter of the line (the name is non-greedy, everything else is optional).
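
Both notes can be checked with a small experiment. This is a sketch, not part of the answer: TAIL, GREEDY, UNANCHORED and ANCHORED are illustrative names, and the three variants are the same pattern written on one line, differing only in the name quantifier and the trailing `$`.

```ruby
LINE = "I still want this.S05E23.720p.HDTV.X264"

# Everything after the name group, shared by the three variants below.
TAIL = '(?<yr>\d{4})?(?:[\s\.]s(?<se>\d\d?)e(?<ep>\d\d?)(?<rest>.*))?'

GREEDY     = /^(?<nm>.+)#{TAIL}$/i   # greedy name, full-line match
UNANCHORED = /^(?<nm>.+?)#{TAIL}/i   # non-greedy name, no trailing $
ANCHORED   = /^(?<nm>.+?)#{TAIL}$/i  # the combination used in the answer

# Greedy name: the name swallows the whole line, season stays unmatched.
p GREEDY.match(LINE)[:nm]
p GREEDY.match(LINE)[:se]

# No trailing $: the match stops after the first character.
p UNANCHORED.match(LINE)[:nm]

# Non-greedy name plus $: the name stops where the season/episode part starts.
p ANCHORED.match(LINE)[:nm]
p ANCHORED.match(LINE)[:se]
```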
