
Ruby 2.0 regex and Cyrillic

Before Ruby 2.0, these regexes matched as expected:

/\A[a-zа-я\d]+\z/i          =~ 'привет' # => 0
/\A[a-z\p{Cyrillic}\d]+\z/i =~ 'привет' # => 0

I updated to Ruby 2.0, and it appears to have a bug:

/\A[a-zа-я\d]+\z/i          =~ 'привет' # => nil
/\A[a-z\p{Cyrillic}\d]+\z/i =~ 'привет' # => nil

How can I deal with this problem? Without \d in the character class, it works correctly:

/\A[a-zа-я]+\z/i            =~ 'привет' # => 0
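
One possible workaround, offered only as a sketch: since the match succeeds once \d is removed from the character class, the digits can be moved into their own alternation branch. This assumes the bug is triggered only when \d shares a character class with the case-insensitive Cyrillic range; I have not verified it on the affected 2.0 builds. The helper name latin_cyrillic_alnum? is made up for illustration.

```ruby
# Hypothetical helper (the name is illustrative, not from the original post).
# Digits are matched by a separate alternation branch so that \d never
# shares a character class with the Cyrillic range.
def latin_cyrillic_alnum?(str)
  !(/\A(?:[a-zа-я]|\d)+\z/i =~ str).nil?
end

p latin_cyrillic_alnum?('привет')
p latin_cyrillic_alnum?('привет42')
p latin_cyrillic_alnum?('привет!')
```

On a Ruby without the bug, this should accept the same strings as the original single-class pattern.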

This bug looks similar and may be related to this bug that I asked about before. I reported it to ruby trunk, and it has been accepted as a bug. Hopefully, it will be fixed.

The bug seems to be fixed in ruby-head:

⮀ rvm use ruby-2.0.0-preview2
Using /home/am/.rvm/gems/ruby-2.0.0-preview2
⮀ irb
2.0.0dev :001 > regex = /\A[a-zа-я\d]+\z/i ; regex =~ 'привет'
# ⇒ nil 
⮀ rvm use ruby-2.0.0-preview1
Using /home/am/.rvm/gems/ruby-2.0.0-preview1
⮀ irb
2.0.0dev :001 > regex = /\A[a-zа-я\d]+\z/i ; regex =~ 'привет'
# ⇒ nil 
⮀ rvm use ruby-head
Using /home/am/.rvm/gems/ruby-head
⮀ irb
irb(main):001:0> regex = /\A[a-zа-я\d]+\z/i ; regex =~ 'привет'
# ⇒ 0
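
To repeat the comparison above without retyping it in irb, something like the following could be saved to a file and run under each installed interpreter after `rvm use`. It is only a sketch: the method name regex_report and the labels are my own, and it simply reruns the three patterns from the question.

```ruby
# Returns [label, result] pairs for each pattern from the question, so the
# output of different Ruby builds can be compared side by side.
# regex_report is a hypothetical name, not part of any library.
def regex_report(subject = 'привет')
  patterns = {
    'range with \d'        => /\A[a-zа-я\d]+\z/i,
    '\p{Cyrillic} with \d' => /\A[a-z\p{Cyrillic}\d]+\z/i,
    'range without \d'     => /\A[a-zа-я]+\z/i
  }
  patterns.map { |label, re| [label, re =~ subject] }
end

puts "ruby #{RUBY_VERSION}"
regex_report.each { |label, result| puts "#{label}: #{result.inspect}" }
```

A result of nil on either of the first two lines would indicate that the build is still affected.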
