
Regex "\w" doesn't process UTF-8 characters in Ruby 1.9.2

Regex \w doesn't match UTF-8 characters in Ruby 1.9.2. Has anybody faced the same problem?

Example:

/[\w\s]+/u

In my Rails application.rb I've added config.encoding = "utf-8".

What do you mean by "doesn't match UTF-8 characters"? If you expect \w to match anything other than the uppercase and lowercase ASCII letters, the ASCII digits, and the underscore, it won't: Ruby 1.9 defines \w as equivalent to [A-Za-z0-9_] regardless of Unicode. Maybe you want \p{Word} or something similar instead.

Ref: Ruby 1.9 Regexp documentation (see the section "Character Classes").
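To make the difference concrete, here is a minimal sketch comparing the pattern from the question with a \p{Word} variant. The sample string and variable names are made up for illustration, and the magic comment is only needed on Ruby 1.9, where source files are not UTF-8 by default.

```ruby
# encoding: utf-8

# Hypothetical sample input containing non-ASCII letters.
text = "niño café"

# \w is ASCII-only, so the match stops at the first accented letter.
ascii_match = text[/[\w\s]+/u]          # => "ni"

# \p{Word} also covers non-ASCII letters, marks, and digits.
unicode_match = text[/[\p{Word}\s]+/u]  # => "niño café"

puts ascii_match
puts unicode_match
```

Note that the /u flag only sets the encoding of the regexp to UTF-8; it does not change what \w matches.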

You could always use something like

[a-zA-Z0-9_ñáéíóú] 

instead of \w
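A hand-written class like this only matches the letters you remembered to list. The sketch below, with made-up sample words and variable names, contrasts it with a property-based class built on \p{Alnum}; treat it as an illustration of the trade-off rather than a recommendation from the original answers.

```ruby
# encoding: utf-8

# The explicit class from the answer above, anchored to the whole string.
explicit_class = /\A[a-zA-Z0-9_ñáéíóú]+\z/

# A property-based alternative: any Unicode letter or digit, plus underscore.
unicode_class = /\A[\p{Alnum}_]+\z/

# Hypothetical sample words: one covered by the explicit list, one not.
words = ["señor", "über"]

explicit_hits = words.select { |w| w =~ explicit_class }  # => ["señor"]
unicode_hits  = words.select { |w| w =~ unicode_class }   # => ["señor", "über"]

puts explicit_hits.join(", ")
puts unicode_hits.join(", ")
```

The explicit class is fine when the set of accepted characters is deliberately small, but it also omits uppercase accented letters such as Ñ or Á unless you add them.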
