Error loading a UTF-8 file with the petl module in Python

Question

    import petl as etl

    file_name = 'name of file'
    file_in_memory = etl.fromcsv(file_name, encoding='utf-8')
    print (etl.look(file_in_memory))

    Traceback (most recent call last):
      File "<interactive input>", line 1, in <module>
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 13: ordinal not in range(128)

The file contains "20 Rue d'Estrées, 75007 Paris, France", which is what triggers the error.
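For context, the byte named in the traceback can be reproduced without petl. The snippet below is only a sketch, assuming Python 3 and that the file stores the address as UTF-8; the variable names are illustrative and not part of the original question.

```python
# The address from the question, encoded the way a UTF-8 file would store it.
raw = "20 Rue d'Estrées, 75007 Paris, France".encode('utf-8')

# Decoding those bytes as ASCII fails at the first non-ASCII byte.
try:
    raw.decode('ascii')
    error = None
except UnicodeDecodeError as exc:
    error = exc

# 'é' is stored as the two bytes 0xc3 0xa9, and ASCII only covers 0x00-0x7f.
failing_byte = raw[error.start]

# Decoding with the matching codec keeps the character intact.
decoded = raw.decode('utf-8')
```

Here `failing_byte` is 0xc3, the same byte the traceback complains about, which suggests the file is being decoded as ASCII somewhere along the way rather than as UTF-8.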

I can read the file with codecs.open(file_name, mode='r', encoding='utf-8'), but I would like to use the petl library so I can manipulate the CSV easily.

Is there a way to load it into memory with petl.fromcsv while preserving the characters?

Answer 1

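First you need to find out the file's encoding using the chardet module. Its UniversalDetector iterates over the contents of the file and guesses the encoding from the bytes it sees.

The result is returned as a dictionary with the key "encoding".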

    from chardet.universaldetector import UniversalDetector
    import petl as etl

    file_name = 'name of file'

    # Feed raw bytes to the detector until it is confident about the encoding.
    detector = UniversalDetector()
    with open(file_name, 'rb') as file_open:
        for line in file_open:
            detector.feed(line)
            if detector.done:
                break
    detector.close()
    file_encoding = detector.result['encoding']

    # Pass the detected encoding on to petl.
    file_in_memory = etl.fromcsv(file_name, encoding=file_encoding)
    print(etl.look(file_in_memory))

If you need this more than once, you can put the encoding detection into a function.
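One possible shape for such a function is sketched below. The name `detect_encoding` is hypothetical, and the sketch assumes Python 3 with chardet installed. Note that chardet only guesses, so the result can be `None` or wrong for very short files.

```python
from chardet.universaldetector import UniversalDetector


def detect_encoding(file_name):
    """Return chardet's best guess at the encoding of file_name (may be None)."""
    detector = UniversalDetector()
    # Open in binary mode: the detector works on undecoded bytes.
    with open(file_name, 'rb') as handle:
        for line in handle:
            detector.feed(line)
            # Stop early once the detector is confident enough.
            if detector.done:
                break
    detector.close()
    return detector.result['encoding']
```

With a helper like this, the loading step could be written as `etl.fromcsv(file_name, encoding=detect_encoding(file_name))`.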

Solution 1
0 2016-01-26 08:47:59