
Rails 3.1.0: incompatible character encodings: ASCII-8BIT and UTF-8

I'm using Rails 3.1.0 and Ruby 1.9.2 with PostgreSQL. I want to read data from huge files (~300 MB) and load it into the database. Here I use a transaction:

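temp = []
File.open("./public/data_to_parse/movies/movies.list").each do |line|
  if line.match(/\t/)
    # Title: everything from the start of the line up to the first tab or "("
    title = line.scan(/^[^\t(]+/)[0]
    title = title.strip if title
    # Year: everything after the last tab
    year = line.scan(/[^\t]+$/)[0]
    year = year.strip if year
    movie = Movie.find_or_create(title, year)
    temp.push(movie) if movie
    # Save in batches of 10,000 records inside a single transaction
    if temp.size == 10000
      Movie.transaction do
        temp.each { |t| t.save }
      end
      temp = []
    end
  end
end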

But I want to improve performance by using a mass insert with raw SQL:

    temp.push("('#{title}', '#{year}')") if movie
    if temp.size == 10000
      sql = "INSERT INTO movies (title, year) VALUES #{temp.join(", ")}"
      Movie.connection.execute(sql)
      temp = []
    end
  end
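
As a side note on the raw SQL above: interpolating titles directly into the statement breaks as soon as a title contains an apostrophe. The following is a minimal sketch of building the multi-row INSERT with quoting. The helpers `sql_literal` and `build_insert` are hypothetical names introduced here, and doubling single quotes assumes PostgreSQL with `standard_conforming_strings` turned on. In a Rails app, `Movie.connection.quote` would normally do this job instead.

```ruby
# Hypothetical helper: quote a value as a SQL string literal by doubling
# embedded single quotes (assumes standard_conforming_strings = on).
def sql_literal(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

# Hypothetical helper: build one multi-row INSERT from [title, year] pairs.
def build_insert(rows)
  values = rows.map do |title, year|
    "(#{sql_literal(title)}, #{sql_literal(year)})"
  end
  "INSERT INTO movies (title, year) VALUES #{values.join(', ')}"
end

rows = [["Rosemary's Baby", "1968"], ["Metropolis", "1927"]]
sql = build_insert(rows)
puts sql
```

The resulting string could then be passed to `Movie.connection.execute(sql)` as in the snippet above.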

But this raises the error "incompatible character encodings: ASCII-8BIT and UTF-8". When I use ActiveRecord, everything works. The files contain characters such as German umlauts. I tried everything from "Rails 3 - (incompatible character encodings: UTF-8 and ASCII-8BIT)", but it didn't help.
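
For context, here is a small sketch of how Ruby can produce this class of error. It is an illustration only and does not establish where exactly the error arose in the code above. Ruby allows concatenating strings with different encodings as long as at most one of them contains non-ASCII bytes; when a binary (ASCII-8BIT) string and a UTF-8 string both contain non-ASCII bytes, it raises `Encoding::CompatibilityError`. The variable names and the helper `try_concat` are made up for this example.

```ruby
# "ä" in ISO-8859-1 is the single byte 0xE4. Tagging the bytes as binary
# mimics data that was read without any encoding conversion.
binary_title = "M\xE4dchen".dup.force_encoding(Encoding::BINARY)

ascii_suffix = "', '1931')"   # UTF-8 string, ASCII characters only
utf8_text    = "Über"         # UTF-8 string with a non-ASCII character

# Hypothetical helper: report whether two strings can be concatenated.
def try_concat(a, b)
  a + b
  :ok
rescue Encoding::CompatibilityError
  :incompatible
end

ascii_result = try_concat(binary_title, ascii_suffix)  # one side is ASCII-only
mixed_result = try_concat(binary_title, utf8_text)     # both sides non-ASCII

puts "binary + ASCII-only UTF-8: #{ascii_result}"
puts "binary + non-ASCII UTF-8:  #{mixed_result}"
```

This is why such code can appear to work on most lines and fail only on those that contain umlauts or other non-ASCII characters.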

Do you have any idea where it comes from?

Thanks,

Solved. The problem was the file encoding: the files were in ISO-8859-1, and I converted them to UTF-8 with iconv.
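
As an alternative to converting the files beforehand, Ruby can transcode while reading. The sketch below is not the approach used above (that was iconv); it assumes the input really is ISO-8859-1 and uses a temporary sample file in place of movies.list. The mode string `"r:ISO-8859-1:UTF-8"` tells Ruby the external encoding of the file and the internal encoding to convert each line to.

```ruby
require "tempfile"

# Create a small ISO-8859-1 sample file standing in for movies.list.
sample = Tempfile.new("movies")
sample.close
File.binwrite(sample.path, "Mädchen in Uniform\t1931\n".encode("ISO-8859-1"))

rows = []
# Read as ISO-8859-1 and transcode every line to UTF-8 on the fly.
File.open(sample.path, "r:ISO-8859-1:UTF-8") do |file|
  file.each_line do |line|
    title, year = line.chomp.split("\t")
    rows << [title, year]
  end
end

title, year = rows.first
puts "#{title} (#{year}) is #{title.encoding}, valid: #{title.valid_encoding?}"

sample.unlink
```

Strings read this way are valid UTF-8, so they can be interpolated into other UTF-8 strings, such as a SQL statement, without an encoding conflict.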
