
How to import all fields from xls as strings into a Pandas dataframe?

I am trying to import an xlsx file into a Python Pandas dataframe. I would like to prevent fields/columns from being interpreted as integers and thus losing leading zeros or other desired heterogeneous formatting.

So for an Excel sheet with 100 columns, I would do the following, using a dict comprehension over range(100).

import pandas as pd
filename = r'C:\DemoFile.xlsx'

# Map each of the 100 column positions (0-99) to the str converter
fields = {col: str for col in range(100)}

df = pd.read_excel(filename, sheet_name=0, converters=fields)

The number of columns varies from one import file to the next, and I would like to handle this without changing the range by hand every time.

Does somebody have any further suggestions or alternatives for reading Excel files into a dataframe and treating all fields as strings by default?

Many thanks!

Use dtype=str when calling pd.read_excel():

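import pandas as pd
filename = r'C:\DemoFile.xlsx'  # raw string, so the backslash is taken literally

# dtype=str reads every column as text instead of inferring numeric types
df = pd.read_excel(filename, dtype=str)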
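To try this without an existing workbook, here is a minimal sketch that writes a small demo file to a temporary directory and reads it back with dtype=str. The file name demo.xlsx and the column names are made up for illustration, and writing .xlsx is assumed to go through the openpyxl package.

```python
import os
import tempfile

import pandas as pd

# Hypothetical demo data: one text column with leading zeros, one integer column.
source = pd.DataFrame({"zip": ["00501", "02134"], "qty": [7, 12]})

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "demo.xlsx")  # made-up file name
    source.to_excel(path, index=False)        # assumes openpyxl is installed

    # dtype=str applies to every column, however many the sheet has.
    df = pd.read_excel(path, dtype=str)

print(df["zip"].tolist())
print(df["qty"].tolist())
```

Both columns should come back as strings, so no column count has to be known in advance.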

Try this:

import pandas as pd

xl = pd.ExcelFile(r'C:\DemoFile.xlsx')
# Column count of the first sheet (xl.book is an xlrd Book here)
ncols = xl.book.sheet_by_index(0).ncols
df = xl.parse(0, converters={i: str for i in range(ncols)})

UPDATE:

In [261]: type(xl)
Out[261]: pandas.io.excel.ExcelFile

In [262]: type(xl.book)
Out[262]: xlrd.book.Book

The usual solution is:

  1. Read in one row of data just to get the column names and the number of columns.
  2. Build the dictionary automatically, mapping each column to the string type.
  3. Re-read the full data using the dictionary created in step 2.
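The three steps above could be sketched as follows. The helper name read_excel_all_str is hypothetical, not a pandas function. It only assumes the read_excel parameters sheet_name, nrows and converters.

```python
import pandas as pd


def read_excel_all_str(path, sheet_name=0):
    """Hypothetical helper: read every column of one sheet as strings."""
    # Step 1: read a single data row, only to learn the column names.
    header = pd.read_excel(path, sheet_name=sheet_name, nrows=1)

    # Step 2: map every column name to the str converter.
    converters = {col: str for col in header.columns}

    # Step 3: re-read the whole sheet using that dictionary.
    return pd.read_excel(path, sheet_name=sheet_name, converters=converters)
```

Usage would be along the lines of df = read_excel_all_str(r'C:\DemoFile.xlsx'). The file is opened twice, which is the price of not needing to know the column count up front.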

