简体   繁体   中英

Rails 2.3.2/Ruby 1.8.6 Encoding Question - ActionController returning UTF-8?

I have a pretty simple Rails question regarding encoding that I can't find an answer to.

Environment: Rails 2.3.2/Ruby1.8.6

I am not setting any encoding options within the Rails environment currently, have left everything to defaults.

If I read a String from disk from a text file - and send it via Rails render :text functionality using Apache/Phusion, what encoding should the client expect?

Thank you for any answers,

Since about Rails 1.2, Rails sets Ruby 1.8's $KCODE magic variable to "UTF8". It includes ActiveSupport::CoreExtensions::String::Multibyte to patch around issues with otherwise ambiguous per-character/per-byte operators. Your text file should be UTF-8, Ruby will pass it through and your application layout should specify a META tag declaring the document's charset to be UTF-8 too:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Then it should all 'just work', but there are some gotchas described below.

If you're on a Mac, running "script/console" in Terminal.app and then pasting unusual character sequences directly into the terminal from eg the Character Viewer is a good way to play around and demonstrate this to your own satisfaction, since the whole OS works in UTF-8. I don't know what the equivalent would be for Windows or an arbitrary Linux distribution.

For example, "⇒" - RIGHTWARDS DOUBLE ARROW - is Unicode 21D2, UTF8 0xE2 (226), 0x87 (125), 0x92 (146). If I paste that into Terminal and ask for the byte values I get the expected result:

>> $KCODE
=> "UTF8"
>> "⇒"
=> "\342\207\222"
>> puts "⇒"
⇒

...but...

>> "⇒"[0]
=> 226
>> "⇒"[1]
=> 135
>> "⇒"[2]
=> 146
>> "⇒"[3]
=> nil

Note how you're still getting byte access with "[]". See the documentation on the Multibyte extensions in the Rails API (for Rails 2.2, eg at http://railsapi.com/ ) if you want to do string operations, otherwise things like "foo.reverse" will do the wrong thing; "foo.mb_chars.reverse" gets it right by using the "mb_chars" proxy.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM