I'm trying to encode a CSV file to UTF-8 using Python

Question

I'm using Python to read many files and re-encode them as UTF-8. I tried the code below:

import os
from os import listdir

def find_csv_filenames(path_to_dir, suffix=".csv" ):
    path_to_dir = os.path.normpath(path_to_dir)
    filenames = listdir(path_to_dir)
#Check *csv directory

    fp = lambda f: not os.path.isdir(path_to_dir+"/"+f) and f.endswith(suffix)
    return [path_to_dir+"/"+fname for fname in filenames if fp(fname)]

def convert_files(files, ascii, to="utf-8"):
    count = 0
    lineno = 0
    for name in files:
        lineno = lineno+1
        with open(name) as f:
            file_target = open(name, mode='r', encoding='latin-1')
            file_content = file_target.read()
            file_target.close

        print(lineno)
        file_source = open("./csv/data{}.csv".format(lineno), mode='w', encoding='utf-8')
        file_source.write(file_content) 

csv_files = find_csv_filenames('./csv', ".csv")
convert_files(csv_files, "cp866")

The problem is that after I read the data and write it to other files with the encoding set to utf8, it still does not work.

Answer 1

Before opening a file whose encoding is unknown, you can use chardet to detect the encoding instead of opening the file with a guessed one. Usage looks like this:

>>> import chardet
>>> # detect() expects the file's raw bytes, not its path
>>> raw = open('PATH/TO/FILE', 'rb').read()
>>> encoding = chardet.detect(raw)['encoding']

Then open the file with the detected encoding and write its contents to a file opened with 'utf-8' encoding.
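The detect-then-rewrite step could be sketched roughly as follows. This is only an illustration, not the asker's code: the helper names `detect_encoding` and `convert_to_utf8` and the `fallback` parameter are made up for this example, and chardet is a third-party package that has to be installed separately. Note that chardet returns a guess, which can be wrong for short or ambiguous files, so the sketch also lets you pass a known encoding (such as "cp866") explicitly.

```python
from pathlib import Path


def detect_encoding(path, fallback="latin-1"):
    """Guess a file's encoding with chardet (hypothetical helper)."""
    import chardet  # third-party; imported lazily so it is only needed here

    raw = Path(path).read_bytes()           # detect() works on bytes
    guess = chardet.detect(raw)["encoding"]  # may be None if undecided
    return guess or fallback


def convert_to_utf8(src, dst, encoding=None):
    """Decode src with the given (or detected) encoding, write dst as UTF-8."""
    if encoding is None:
        encoding = detect_encoding(src)
    text = Path(src).read_bytes().decode(encoding)
    # Writing bytes keeps the original line endings untouched.
    Path(dst).write_bytes(text.encode("utf-8"))
```

For example, `convert_to_utf8("in.csv", "out.csv", encoding="cp866")` skips detection, while `convert_to_utf8("in.csv", "out.csv")` relies on chardet's guess. Writing the output to a different directory than the input avoids overwriting files that have not been read yet.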

If you're not sure whether the file was actually converted to 'utf-8', you can run enca in a Linux shell to check whether its encoding is reported as 'ASCII' or 'utf-8':

$ enca FILENAME
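If enca is not available, a similar check can be approximated in Python by attempting a strict UTF-8 decode of the file's bytes. This is a minimal sketch and not a replacement for enca: the function name `is_valid_utf8` is made up for this example, and the check only tells you that the bytes are well-formed UTF-8 (plain ASCII passes too), not that the text was decoded from the right source encoding.

```python
def is_valid_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")  # strict by default: raises on invalid bytes
    except UnicodeDecodeError:
        return False
    return True
```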

solution1
0 2013-12-13 06:55:27
