
Rails 2.3.2/Ruby 1.8.6 Encoding Question - ActionController returning UTF-8?

I have a pretty simple Rails question regarding encoding that I can't find an answer to.

Environment: Rails 2.3.2/Ruby 1.8.6

I am not currently setting any encoding options within the Rails environment; I have left everything at the defaults.

If I read a String from a text file on disk and send it via Rails' render :text functionality using Apache/Phusion, what encoding should the client expect?

Thank you for any answers,

Since about Rails 1.2, Rails sets Ruby 1.8's $KCODE magic variable to "UTF8". It includes ActiveSupport::CoreExtensions::String::Multibyte to patch around issues with otherwise ambiguous per-character/per-byte operators. Your text file should be UTF-8; Ruby will pass it through unchanged, and your application layout should specify a META tag declaring the document's charset to be UTF-8 too:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Then it should all 'just work', but there are some gotchas described below.
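Because Ruby 1.8 passes the file's bytes through untouched, it can be worth confirming that the file on disk really is UTF-8 before relying on that. The following is only a minimal sketch, not part of Rails: the helper names utf8_bytes? and utf8_file? are made up for illustration, and the check relies on String#unpack("U*") raising ArgumentError when it meets malformed UTF-8.

```ruby
# Returns true when the given bytes decode cleanly as UTF-8.
# String#unpack("U*") raises ArgumentError on malformed input.
def utf8_bytes?(data)
  data.unpack("U*")
  true
rescue ArgumentError
  false
end

# Hypothetical helper: read a file as raw bytes and run the same check.
def utf8_file?(path)
  utf8_bytes?(File.open(path, "rb") { |f| f.read })
end

good = [0xE2, 0x87, 0x92].pack("C*")  # complete UTF-8 sequence for U+21D2
bad  = [0xE2, 0x87].pack("C*")        # same sequence with the last byte cut off

puts utf8_bytes?(good)
puts utf8_bytes?(bad)
```

The byte strings are built with pack so the example does not depend on the encoding of the source file itself.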

If you're on a Mac, running "script/console" in Terminal.app and then pasting unusual character sequences directly into the terminal from, e.g., the Character Viewer is a good way to play around and demonstrate this to your own satisfaction, since the whole OS works in UTF-8. I don't know what the equivalent would be for Windows or an arbitrary Linux distribution.

For example, "⇒" (RIGHTWARDS DOUBLE ARROW) is Unicode U+21D2, which in UTF-8 is 0xE2 (226), 0x87 (135), 0x92 (146). If I paste that into Terminal and ask for the byte values I get the expected result:

>> $KCODE
=> "UTF8"
>> "⇒"
=> "\342\207\222"
>> puts "⇒"
⇒

...but...

>> "⇒"[0]
=> 226
>> "⇒"[1]
=> 135
>> "⇒"[2]
=> 146
>> "⇒"[3]
=> nil
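The three byte values returned above follow directly from the UTF-8 encoding rules for code points between U+0800 and U+FFFF, which are split across a 1110xxxx lead byte and two 10xxxxxx continuation bytes. As a rough sketch (utf8_three_bytes is a name invented here, and the range check ignores the surrogate block for brevity), the arithmetic can be reproduced by hand and compared with what pack("U") produces:

```ruby
# Encode a code point in the three-byte UTF-8 range by hand.
def utf8_three_bytes(cp)
  unless cp >= 0x0800 && cp <= 0xFFFF
    raise ArgumentError, "code point outside the three-byte range"
  end
  [
    0xE0 | (cp >> 12),          # 1110xxxx: top 4 bits
    0x80 | ((cp >> 6) & 0x3F),  # 10xxxxxx: middle 6 bits
    0x80 | (cp & 0x3F)          # 10xxxxxx: low 6 bits
  ]
end

manual  = utf8_three_bytes(0x21D2)
builtin = [0x21D2].pack("U").unpack("C*")  # let Ruby encode, then list the bytes

p manual   # [226, 135, 146]
p builtin  # [226, 135, 146]
```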

Note how, in the console session above, "[]" still gives you byte access. If you want to do string operations, see the documentation on the Multibyte extensions in the Rails API (for Rails 2.2, e.g. at http://railsapi.com/ ); otherwise things like "foo.reverse" will do the wrong thing, whereas "foo.mb_chars.reverse" gets it right by using the "mb_chars" proxy.
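To see why a byte-wise reverse goes wrong, here is a small sketch in plain Ruby that does not use ActiveSupport at all. It imitates the two behaviours with pack/unpack: reversing raw bytes (roughly what "reverse" does on Ruby 1.8) versus reversing code points (roughly what the "mb_chars" proxy arranges). The variable names are illustrative only.

```ruby
s = [0x61, 0x21D2, 0x62].pack("U*")  # "a", U+21D2, "b" as UTF-8 bytes

# Byte-wise reverse: the three bytes of U+21D2 end up in the wrong order.
byte_reversed = s.unpack("C*").reverse.pack("C*")

# Code-point reverse: each character's bytes stay together.
char_reversed = s.unpack("U*").reverse.pack("U*")

p byte_reversed.unpack("C*")  # [98, 146, 135, 226, 97]
p char_reversed.unpack("C*")  # [98, 226, 135, 146, 97]
```

The first result begins its multibyte run with 146 (0x92), a continuation byte with no lead byte before it, so it is no longer valid UTF-8; the second keeps 226, 135, 146 intact between the two ASCII letters.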


 