Python Convert Excel to CSV

There are a lot of posts on this subject, and my solution is in line with the most common answer. However, I'm encountering an encoding error that I don't know how to address.

>>> def Excel2CSV(ExcelFile, SheetName, CSVFile):
     import xlrd
     import csv
     workbook = xlrd.open_workbook(ExcelFile)
     worksheet = workbook.sheet_by_name(SheetName)
     csvfile = open(CSVFile, 'wb')
     wr = csv.writer(csvfile, quoting=csv.QUOTE_ALL)

     for rownum in xrange(worksheet.nrows):
         wr.writerow(worksheet.row_values(rownum))

     csvfile.close()

>>> Excel2CSV(r"C:\Temp\Store List.xls", "Open_Locations", 
              r"C:\Temp\StoreList.csv")

Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    Excel2CSV(r"C:\Temp\Store List.xls", "Open_Locations", r"C:\Temp\StoreList.csv")
  File "<pyshell#1>", line 10, in Excel2CSV
    wr.writerow(worksheet.row_values(rownum))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 14: ordinal not in range(128)
>>>

Any help or insight is greatly appreciated.

As @davidism points out, the Python 2 csv module doesn't work with unicode. You can work around this by encoding all of your unicode objects to str (byte string) objects before passing them to csv:

def Excel2CSV(ExcelFile, SheetName, CSVFile):
     import xlrd
     import csv
     workbook = xlrd.open_workbook(ExcelFile)
     worksheet = workbook.sheet_by_name(SheetName)
     csvfile = open(CSVFile, 'wb')
     wr = csv.writer(csvfile, quoting=csv.QUOTE_ALL)

     for rownum in xrange(worksheet.nrows):
         # Encode unicode cells to UTF-8 byte strings; leave numbers and other types as-is.
         wr.writerow(
             [x.encode('utf-8') if isinstance(x, unicode) else x
              for x in worksheet.row_values(rownum)])

     csvfile.close()
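
To see what the per-cell conversion is doing, here is a minimal sketch that isolates it. The helper name `encode_cell` is made up for illustration and is not part of xlrd or csv. The character u'\xa0' from the traceback is a non-breaking space, which ASCII cannot represent but UTF-8 can.

```python
# -*- coding: utf-8 -*-
# Hypothetical helper for illustration; runs on Python 2 and Python 3.
TEXT_TYPE = type(u'')  # unicode on Python 2, str on Python 3


def encode_cell(value, encoding='utf-8'):
    """Return text cells as encoded bytes; pass other cell types through unchanged."""
    if isinstance(value, TEXT_TYPE):
        return value.encode(encoding)
    return value


cell = u'Store\xa0List'  # contains a non-breaking space, like the failing cell

# Encoding to ASCII fails, which is the error shown in the traceback.
try:
    cell.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding to UTF-8 succeeds and yields a byte string.
encoded = encode_cell(cell)
```

Numeric cells (xlrd returns floats for numbers) are not text, so the helper hands them back untouched.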

The Python 2 csv module has some problems with unicode data. You can either encode everything to UTF-8 before writing, or use the unicodecsv module to do it for you.

First pip install unicodecsv. Then, instead of import csv, just import unicodecsv as csv. The API is the same (plus encoding options), so no other changes are needed.
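
If moving to Python 3 is an option (an assumption; the question is on Python 2), the built-in csv module writes text directly and the encoding is chosen when the file is opened. A rough sketch, where `rows_to_csv` is a hypothetical name and the rows stand in for what `worksheet.row_values(rownum)` would return:

```python
import csv
import os
import tempfile


def rows_to_csv(rows, csv_path):
    """Write an iterable of rows to csv_path as UTF-8, quoting every field."""
    # newline='' lets the csv module control line endings itself.
    with open(csv_path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f, quoting=csv.QUOTE_ALL)
        writer.writerows(rows)


# Stand-in data: one row holding a cell with a non-breaking space and a number.
path = os.path.join(tempfile.mkdtemp(), 'StoreList.csv')
rows_to_csv([['Store\xa0List', 1.0]], path)

# Read the raw bytes back to confirm the cell was written as UTF-8.
with open(path, 'rb') as f:
    written = f.read()
```

No per-cell encoding is needed here because Python 3 strings are already unicode.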

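Another way to do this: convert each cell to a string and encode it as UTF-8. The conversion has to be applied per cell. Calling str() on the whole row would produce a single string (the repr of the list), and writerow would then write one character per column.

[unicode(x).encode('utf-8') for x in worksheet.row_values(rownum)]

The whole function:

def Excel2CSV(ExcelFile, SheetName, CSVFile):
     import xlrd
     import csv
     workbook = xlrd.open_workbook(ExcelFile)
     worksheet = workbook.sheet_by_name(SheetName)
     csvfile = open(CSVFile, 'wb')
     wr = csv.writer(csvfile, quoting=csv.QUOTE_ALL)

     for rownum in xrange(worksheet.nrows):
         # Convert every cell to text, then encode it to a UTF-8 byte string.
         wr.writerow([unicode(x).encode('utf-8')
                      for x in worksheet.row_values(rownum)])

     csvfile.close()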
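
One thing to watch for with any approach that builds the row before calling writerow: csv.writer treats its argument as a sequence of cells, so a single string gets split into one-character fields. A small check, written for Python 3 (an assumption; the code in this thread is Python 2):

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)

writer.writerow("ab")    # a bare string is iterated character by character
writer.writerow(["ab"])  # a list with one cell is written as one field

output = buf.getvalue()
# First line has two fields ("a" and "b"), second line has one ("ab").
```

So whatever conversion is used, the value passed to writerow should stay a list with one entry per cell.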
