
Ruby 1.8 and UTF-8 string case statement compare

I have a Rake task (in the lib/tasks directory) that I run with cron on my shared web hosting. The problem is that I want to compare a UTF-8 string using a case statement, but my source code is not UTF-8 encoded. If I save the source code as UTF-8, I get an error when I try to start it :(

What should I do?

Maybe I should read these strings from an external UTF-8 txt file?

PS: I'm using Ruby 1.8

PS: I mean comparing this way:

result = case utf8string
   when 'АБВ' then 1
   when 'ГДИ' then 2
   when 'ЙКЛ' then 3
   when 'МНО' then 4
   else 5
end

I found that my problem was not in the case statement.

The problem was that when I saved my source code in UTF-8 format, my text editor added 3 bytes (a BOM) at the beginning to indicate that the encoding is UTF-8.

Q: What is a BOM?

A: A byte order mark (BOM) consists of the character code U+FEFF at the beginning of a data stream, where it can be used as a signature defining the byte order and encoding form, primarily of unmarked plaintext files. Under some higher level protocols, use of a BOM may be mandatory (or prohibited) in the Unicode data stream defined in that protocol.

UTF-8, UTF-16, UTF-32 & BOM

The error that I got was:

1: Invalid char `\357' in expression
1: Invalid char `\273' in expression
1: Invalid char `\277' in expression

I'd say you need to change your text editor, as a BOM is not needed for UTF-8. UTF-8 is not byte-order dependent. See link text for details.
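If changing the editor is not an option, the BOM can also be stripped from the saved file afterwards. The following is only a minimal sketch: the helper names (starts_with_bom?, strip_bom, strip_bom_from_file) are made up for illustration, and it works on raw bytes via unpack/pack so that it should behave the same on Ruby 1.8 and on later versions. Note that the three bytes it looks for are exactly the \357, \273 and \277 (octal) from the error message.

```ruby
# The UTF-8 BOM is the byte sequence 0xEF 0xBB 0xBF
# (octal \357 \273 \277, the three "invalid chars" in the error).
BOM_BYTES = [0xEF, 0xBB, 0xBF]

# Hypothetical helper: true when the string starts with the UTF-8 BOM bytes.
def starts_with_bom?(data)
  data.unpack('C3') == BOM_BYTES
end

# Hypothetical helper: returns the data without a leading BOM.
# Works on bytes, so it does not depend on the string's encoding.
def strip_bom(data)
  return data unless starts_with_bom?(data)
  data.unpack('C*')[3..-1].pack('C*')
end

# Hypothetical helper: rewrites the file in place if it starts with a BOM.
# Returns true when a BOM was found and removed, false otherwise.
def strip_bom_from_file(path)
  data = File.open(path, 'rb') { |f| f.read }
  return false unless starts_with_bom?(data)
  File.open(path, 'wb') { |f| f.write(strip_bom(data)) }
  true
end
```

For example, calling strip_bom_from_file on the Rake file (the path is whatever yours is) once after saving would remove the three leading bytes and leave the rest of the file untouched.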

Try using the mb_chars method from Rails' ActiveSupport framework:

result = case utf8string.mb_chars
   when 'АБВ' then 1
   when 'ГДИ' then 2
   when 'ЙКЛ' then 3
   when 'МНО' then 4
   else 5
end

