
Ruby Character Encoding Confusion When Reading Same File In Different Environments

I have a Rails application that accepts CSV file uploads. When developing the feature locally on my Mac, I received an "invalid byte sequence in UTF-8" error when trying to parse the uploaded file (using Ruby's standard library CSV).

So after doing some research and reading answers to similar questions on StackOverflow, I tried using a gem (namely CharDet) to sniff out the character encoding, and then specified that encoding when opening the file via the CSV library. This solved all my problems, and life was good.

    content = File.read(fullpath)
    self.file_encoding = CharDet.detect(content)['encoding']
    CSV.table(fullpath, :encoding => file_encoding, :header_converters => :downcase).headers

But then I deployed this code to the production Linux environment, and the "invalid byte sequence in UTF-8" errors came back. What a mystery (to me anyway)! After quite some time trying to resolve the error, I tried removing the code that specified the encoding upon opening the file. Miraculously, that fixed the problem on production, but now local Mac development is broken.

Keep in mind that in both cases I'm uploading the same file using the same browser. Does anyone have any insight into what is going on here?

By the way, the Ruby versions are close, but not the same. The Mac is ruby 1.9.3-p0, and the Linux server is 1.9.2-p180. The app is Rails 3.2.6.

A few thoughts:

  1. Have you confirmed the encoding of the file that you're uploading?
  2. Have you tested with 1.9.2-p180 on your Mac, as Frederick Cheung suggested?
  3. Have you tried outputting the results of CharDet.detect on each platform to see what the encoding of the received file (as opposed to the uploaded file) is? I wonder if some configuration differs between Apache on Linux and WEBrick on your Mac.
  4. Are you using the same version of CharDet on both platforms? What libraries does it use (e.g. iconv), and are they the same version on both platforms?
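To compare what each machine actually received, a small diagnostic along these lines could be run on both the Mac and the Linux server. This is only a sketch: `encoding_report` is a hypothetical helper name, not something from the question, and it uses only core Ruby rather than CharDet so that it does not depend on the gem version. It reads the file as raw bytes, so the result is not affected by locale-dependent decoding.

```ruby
# Hypothetical diagnostic helper (not from the original post).
# Reads the file as raw bytes so no locale-dependent decoding takes place.
def encoding_report(path)
  bytes   = File.binread(path)                 # String tagged ASCII-8BIT
  as_utf8 = bytes.dup.force_encoding('UTF-8')  # relabel a copy; bytes are unchanged
  {
    :ruby_version     => RUBY_VERSION,
    :default_external => Encoding.default_external.name,
    :byte_size        => bytes.bytesize,
    :valid_utf8       => as_utf8.valid_encoding?,
    :first_bytes_hex  => bytes[0, 16].unpack('H*').first
  }
end
```

Calling `p encoding_report(fullpath)` in both environments and comparing the output should show whether the received bytes differ (`:byte_size`, `:first_bytes_hex`) or whether only Ruby's defaults differ (`:default_external`, which Ruby derives from the process locale).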

I'm not aware of any differences in behavior with regard to encoding between 1.9.2 and 1.9.3, but I haven't specifically researched it either. It could also be a difference in the configuration of the MRI build.
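If the difference does turn out to be environmental (for example, a different `Encoding.default_external` on each machine, which is an assumption here and not something the question confirms), one way to take the environment out of the picture is to read the upload as raw bytes and convert it to UTF-8 explicitly before parsing. A minimal sketch, where `read_as_utf8` is a hypothetical helper and `source_encoding` is whatever encoding you have established for the file (for instance the value CharDet reports):

```ruby
# Hypothetical helper (not from the original post): decode raw bytes from a
# known source encoding and return a UTF-8 string.
def read_as_utf8(path, source_encoding)
  raw = File.binread(path)             # raw bytes; default_external is not consulted
  raw.force_encoding(source_encoding)  # label the bytes with their actual encoding
  # When the source encoding differs from UTF-8, bytes that are invalid in the
  # source or have no UTF-8 equivalent are replaced instead of raising.
  raw.encode('UTF-8', :invalid => :replace, :undef => :replace)
end
```

The returned string can then be handed to `CSV.parse(text, :headers => true, :header_converters => :downcase)` instead of letting CSV open the file itself. Note that `force_encoding` raises if `source_encoding` is `nil`, so a failed detection needs to be handled before calling this.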

