
Python - Loading Zip Codes into a DataFrame as Strings?

I'm using Pandas to load an Excel spreadsheet that contains zip codes (e.g. 32771). The zip codes are stored as 5-digit strings in the spreadsheet. When they are pulled into a DataFrame using the commands...

xls = pd.ExcelFile("5-Digit-Zip-Codes.xlsx")
dfz = xls.parse('Zip Codes')

they are converted into numbers. So '00501' becomes 501.

So my questions are, how do I:

a. Load the DataFrame and keep the string type of the zip codes stored in the Excel file?

b. Convert the numbers in the DataFrame into five-digit strings, e.g. 501 becomes "00501"?

As a workaround, you could convert the ints to 0-padded strings of length 5 using Series.str.zfill:

df['zipcode'] = df['zipcode'].astype(str).str.zfill(5)

Demo:

import pandas as pd
df = pd.DataFrame({'zipcode':['00501']})
df.to_excel('/tmp/out.xlsx')
xl = pd.ExcelFile('/tmp/out.xlsx')
df = xl.parse('Sheet1')
df['zipcode'] = df['zipcode'].astype(str).str.zfill(5)
print(df)

yields

  zipcode
0   00501

str(my_zip).zfill(5)

or

print("{0:>05s}".format(str(my_zip)))

are two of the many ways to do this.
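As a minimal sketch of the two approaches above, the helpers below wrap each one in a function. The names `pad_zip` and `pad_zip_format` are hypothetical, not from the original answer, and the sketch assumes the input is an int or a string of digits.

```python
def pad_zip(value, width=5):
    """Return value as a zero-padded string of the given width."""
    # str() handles ints such as 501; zfill left-pads with '0' up to width.
    return str(value).zfill(width)


def pad_zip_format(value, width=5):
    """Same result using str.format with an explicit fill and alignment."""
    # '0' is the fill character, '>' right-aligns, and the width is taken
    # from the second positional argument via a nested replacement field.
    return "{0:0>{1}}".format(str(value), width)


print(pad_zip(501))
print(pad_zip_format("501"))
```

Values that already have five characters pass through unchanged, so either helper can be applied to a mixed column.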

You can avoid pandas' type inference with a custom converter, e.g. if 'zipcode' was the header of the column with zip codes:

dfz = xls.parse('Zip Codes', converters={'zipcode': lambda x:x})
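The effect of bypassing type inference can be sketched without a spreadsheet. The example below uses `pd.read_csv` on in-memory text purely as a stand-in for the Excel file (an assumption made so the snippet runs anywhere pandas is installed); the column name and sample values are illustrative. It compares default inference with a `converters` entry and with a per-column `dtype`.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the 'Zip Codes' sheet.
raw = "zipcode\n00501\n32771\n"

# Default type inference parses the column as integers, dropping leading zeros.
inferred = pd.read_csv(io.StringIO(raw))

# A converter (here simply str) bypasses inference for that column.
converted = pd.read_csv(io.StringIO(raw), converters={"zipcode": str})

# Requesting a string dtype for the column has the same effect.
typed = pd.read_csv(io.StringIO(raw), dtype={"zipcode": str})

print(inferred["zipcode"].tolist())
print(converted["zipcode"].tolist())
print(typed["zipcode"].tolist())
```

`pd.read_excel` also accepts `converters` and `dtype`, so the same pattern should carry over to the spreadsheet case, though the exact behaviour may depend on the pandas version and on how the cells are stored in the workbook.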

This is arguably a bug, since the column was originally string encoded; I have opened an issue about it.
