Python: Keep leading zeroes when converting from Excel to CSV with pandas

I have an Excel sheet that needs to be inserted into a database. I wrote a Python script that takes the Excel file, converts it to CSV, and then inserts it into the database. The problem is that the sheet contains zip codes, and the conversion strips their leading zeroes.

Here is my code that reads the Excel sheet and writes it to a CSV file:

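import csv

import pandas as pd


def excel_to_csv():
    # excel_path, fileName and csv_file are defined elsewhere in the script
    xlsx = pd.read_excel(excel_path + fileName + '.xlsx')
    xlsx.to_csv(csv_file, encoding='utf-8', index=False, na_rep=None, quoting=csv.QUOTE_NONE)


excel_to_csv()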

I then use this code to insert the rows into the database:

with open(csv_file, 'rb') as f:
    reader = csv.reader(f, delimiter=',', quoting=csv.QUOTE_NONE)
    next(reader)
    for row in reader:
        cur.execute(
            "INSERT INTO table (foo1, foo2, zipcode, foo3) VALUES (%s, %s, %s, %s); ",
            row
        )

conn.commit()

When I print the CSV after it has been converted from Excel, I get this result:

foo1,foo2,zipcode,foo3
353453452,DATA,37,CITY
463464356,DATA,2364,CITY

The zipcode cells in the Excel file are formatted as text, so they keep their leading zeroes there. How can I keep the leading zeroes when I convert the Excel file to CSV?

From the docs:

dtype : Type name or dict of column -> type, default None
Data type for data or columns. E.g. {'a': np.float64, 'b': np.int32}. Use object to preserve data as stored in Excel and not interpret dtype. If converters are specified, they will be applied INSTEAD of dtype conversion.
New in version 0.20.0.

So you can tell pd.read_excel not to interpret the data by setting the dtype keyword argument to object:

xlsx = pd.read_excel(excel_path + fileName + '.xlsx', dtype='object')
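
If only some columns need to stay as text, the dict form of dtype described in the docs above can be used instead of applying object to the whole sheet. The following is a minimal sketch, not the asker's actual script: the function signature, the text_columns parameter, and the choice to drop the quoting and na_rep arguments are assumptions made for illustration. It also assumes an Excel engine such as openpyxl is installed for reading .xlsx files.

```python
import pandas as pd


def excel_to_csv(xlsx_path, csv_path, text_columns=("zipcode",)):
    """Convert an Excel sheet to CSV, reading the given columns as text.

    text_columns is a hypothetical parameter naming the columns whose
    values must not be parsed as numbers (for example zip codes).
    """
    # Map each text column to str so pandas keeps its values as strings.
    dtypes = {column: str for column in text_columns}
    frame = pd.read_excel(xlsx_path, dtype=dtypes)
    # Write with pandas' default quoting; the index is left out of the file.
    frame.to_csv(csv_path, encoding="utf-8", index=False)
    return frame
```

Called as excel_to_csv(excel_path + fileName + '.xlsx', csv_file), this leaves the other columns to pandas' normal type inference, so numeric fields such as foo1 are still read as numbers while zipcode stays a string.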
