Read many CSV files and write them back encoded as UTF-8 using Python

Question

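I am using Python code to read from many CSV files and set their encoding to UTF-8. When reading a file I can read all of its lines, but when writing only one line is written, and that is my problem. Please help me check my code, shown below: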

def convert_files(files, ascii, to="utf-8"):
    for name in files:
        #print ("Convert {0} from {1} to {2}").format(name, ascii, to)
        with open(name) as f:
            print(name)
            count = 0
            lineno = 0
            # at this point I want to write the text below as the first line of each new file
            #file_source.write('id;nom;prenom;nom_pere;nom_mere;prenom_pere;prenom_mere;civilite (1=homme 2=f);date_naissance;arrondissement;adresse;ville;code_postal;pays;telephone;email;civilite_demandeur (1=homme 2=f);nom_demandeur;prenom_demandeur;qualite_demandeur;type_acte;nombre_actes\n')
            for line in f.readlines():
                lineno +=1
                if lineno == 1 :
                    continue
                file_source = open(name, mode='w', encoding='utf-8', errors='ignore')
                #pass
                #print (line)
                # start writing data to the new file with the new encoding

                file_source.write(line)
                #file_source.close

#print unicode(line, "cp866").encode("utf-8")
csv_files = find_csv_filenames('./csv', ".csv")
convert_files(csv_files, "cp866")

Answer 1

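You are reopening the file on every iteration. Opening it with mode='w' truncates it each time, so only the last line written survives. Open it once, before the loop: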

for line in f.readlines():
        lineno +=1
        if lineno == 1 :
            continue
        #move the following line outside of the for block
        file_source = open(name, mode='w', encoding='utf-8', errors='ignore')
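
Putting that advice together, here is a minimal sketch of what the corrected function might look like. It assumes the source files are cp866 (as in the question's call) and that the original first line should be replaced by a new header. `convert_file` and the short `HEADER` value are illustrative names, not part of the question's code.

```python
# Hypothetical, shortened header; the question uses a much longer one.
HEADER = "id;nom;prenom\n"

def convert_file(name, from_encoding="cp866", to_encoding="utf-8", header=HEADER):
    # Read every line first, so the same path can be reopened for writing.
    with open(name, encoding=from_encoding, newline="") as src:
        lines = src.readlines()
    # Open the output once, outside any loop: mode "w" truncates only here.
    with open(name, mode="w", encoding=to_encoding, newline="") as dst:
        dst.write(header)          # new first line
        dst.writelines(lines[1:])  # everything except the original first line
```

Calling `convert_file(path)` for each path in `csv_files` would then rewrite every file in place. The whole file is held in memory, so this fits small and medium files.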

Answer 2

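If you only need to change the files' character encoding, it does not matter that they are CSV files, unless the conversion could change which characters are interpreted as the delimiter, quotechar, and so on: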

def convert(filename, from_encoding, to_encoding):
    with open(filename, newline='', encoding=from_encoding) as file:
        data = file.read().encode(to_encoding)
    with open(filename, 'wb') as outfile:
        outfile.write(data)

for path in csv_files:
    convert(path, "cp866", "utf-8")

Add the errors parameter to change how encoding/decoding errors are handled.
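
As a rough illustration of the errors parameter, the variant below passes it to both the decoding step (`open`) and the encoding step (`str.encode`). The name `convert_lenient` and the default `"replace"` are assumptions made for this sketch. Other standard handlers such as `"strict"` or `"ignore"` work the same way.

```python
def convert_lenient(filename, from_encoding, to_encoding, errors="replace"):
    # Undecodable bytes become U+FFFD instead of raising UnicodeDecodeError.
    with open(filename, newline='', encoding=from_encoding, errors=errors) as file:
        text = file.read()
    # Characters the target encoding cannot represent are replaced with "?".
    data = text.encode(to_encoding, errors=errors)
    with open(filename, 'wb') as outfile:
        outfile.write(data)
```

Note that `"replace"` and `"ignore"` silently lose information, so it may be worth running with the default strict handling first to see whether the files really contain invalid bytes.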

If the files are large, you can convert the data incrementally:

import os
from shutil import copyfileobj
from tempfile import NamedTemporaryFile

def convert(filename, from_encoding, to_encoding):
    with open(filename, newline='', encoding=from_encoding) as file:
        # delete=False keeps the temporary file on disk after it is closed;
        # setting tmpfile.delete afterwards has no effect on Python 3.4+
        with NamedTemporaryFile('w', encoding=to_encoding, newline='',
                                dir=os.path.dirname(filename),
                                delete=False) as tmpfile:
            copyfileobj(file, tmpfile)  # copies in chunks, not all at once
    os.replace(tmpfile.name, filename) # rename tmpfile -> filename

for path in csv_files:
    convert(path, "cp866", "utf-8")
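
The loops above take `csv_files` from the question, where it comes from a `find_csv_filenames` helper that is never shown. One plausible way to write such a helper, purely as an assumption about what it does, is:

```python
from pathlib import Path

def find_csv_filenames(directory, suffix=".csv"):
    """Return the sorted paths of regular files in `directory` ending with `suffix`."""
    return sorted(
        str(p) for p in Path(directory).iterdir()
        if p.is_file() and p.name.endswith(suffix)
    )
```

With this sketch, `csv_files = find_csv_filenames('./csv', ".csv")` yields paths that can be passed straight to `convert`.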

Answer 3

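You can do it like this:

def convert_files(files, ascii, to="utf-8"):
    for name in files:
        # open in binary mode so that decoding and encoding are explicit
        with open(name, 'r+b') as f:
            data = f.read()
            # decode()/encode() return new objects, so keep the result
            data = data.decode(ascii).encode(to)
            f.seek(0)
            f.write(data)
            f.truncate()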

Answer 1
1 2013-12-13 04:27:58

Answer 2
0 2013-12-13 04:40:25

Answer 3
0 2013-12-13 04:43:09
