
Regex “\w” doesn't process utf-8 characters in Ruby 1.9.2

The regex \w doesn't match UTF-8 characters in Ruby 1.9.2. Has anybody faced the same problem?

Example:

/[\w\s]+/u

In my Rails application.rb I've added config.encoding = "utf-8".

What do you mean by "doesn't match utf-8 characters"? If you expect \w to match anything other than the uppercase and lowercase ASCII letters, the ASCII digits, and underscore, it won't: Ruby 1.9 defines \w to be equivalent to [a-zA-Z0-9_] regardless of Unicode. The /u flag only sets the regex encoding to UTF-8; it does not widen \w. Maybe you want \p{Word} or something similar instead.

Ref: Ruby 1.9 Regexp documentation (see section "Character Classes").
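
A minimal sketch of the difference, assuming the source file and the strings are UTF-8. The sample text and variable names are made up for illustration; only \w and \p{Word} come from the answer above.

```ruby
# encoding: utf-8

text = "niño café_123"

# \w is ASCII-only, so accented letters split the words apart.
ascii_words = text.scan(/\w+/)
# => ["ni", "o", "caf", "_123"]

# \p{Word} covers Unicode letters, marks, digits and underscore.
unicode_words = text.scan(/\p{Word}+/)
# => ["niño", "café_123"]

# The pattern from the question, rewritten with \p{Word}.
ascii_full   = !!(text =~ /\A[\w\s]+\z/u)        # => false
unicode_full = !!(text =~ /\A[\p{Word}\s]+\z/u)  # => true

puts ascii_words.inspect
puts unicode_words.inspect
```

\p{Word} can be used inside a bracket expression, so /[\w\s]+/u from the question becomes /[\p{Word}\s]+/u.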

You could always use something like

[a-zA-Z0-9_ñáéíóú] 

instead of \w.
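
A short sketch of how such an explicit class behaves, assuming UTF-8 source and precomposed characters. The sample strings and names are hypothetical, and \s is added here only so that whole phrases can be tested.

```ruby
# encoding: utf-8

# Explicit class from the answer, plus whitespace, anchored to the whole string.
explicit = /\A[a-zA-Z0-9_ñáéíóú\s]+\z/

listed   = "el niño comió"  # every character is in the class
unlisted = "Ñandú güero"    # uppercase Ñ and ü are not listed

listed_ok   = !!(listed =~ explicit)    # => true
unlisted_ok = !!(unlisted =~ explicit)  # => false

puts listed_ok
puts unlisted_ok
```

The class matches only the characters that are spelled out, so uppercase accented letters or letters from other languages would each need to be added by hand.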


 