pandas: read Excel values as formatted
How do I get the values of a spreadsheet as they are formatted? I'm working on spreadsheets with a currency format, like this, for example:
ITEM NAME UNIT PRICE
item1 USD 99
item2 SGD 45
But the terms 'USD' and 'SGD' were added using Excel's formatting capabilities, so the read_excel function of pandas does not see them. I get the values, but not the currency names.
I can only work with the spreadsheets as they are, and since I have various spreadsheets with about 6-7 sheets each, I was hoping for a pandas-level (or Python-level) solution rather than an Excel-level one.
Thanks, guys.
To Daniel: this is how I implemented the 'xlrd' engine, which didn't seem to do anything.
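import pandas as pd

# The engine is set once on the ExcelFile; read_excel then reuses it.
# Note: xlrd 2.0+ only reads .xls, so .xlsx needs an older xlrd or openpyxl.
excel = pd.ExcelFile('itemlist.xlsx', engine='xlrd')
frames = []
for sheet in excel.sheet_names:
    frames.append(pd.read_excel(excel, sheet, header=2))
master = pd.concat(frames)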
There isn't a great way to do this. pandas has no knowledge of the number formats, and xlrd doesn't seem to be able to read formats from a .xlsx file (see here).
You could use openpyxl to accomplish this. It at least has access to the number formats, but it looks like you'd basically have to implement all of the parsing logic yourself.
In [26]: from openpyxl import load_workbook
In [27]: wb = load_workbook('temp.xlsx')
In [28]: ws = wb.worksheets[0]
In [29]: ws["B2"]  # numeric value = 4, formatted as "USD 4"
Out[29]: <Cell 'Sheet1'.B2>
In [30]: ws["B2"].value
Out[30]: 4
In [31]: ws["B2"].number_format
Out[31]: '"USD "#'
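As a rough illustration of what "implementing the parsing logic yourself" could look like, here is a minimal sketch. It assumes the currency label always appears as a double-quoted literal in the number format (as in '"USD "#' above), that item names are in column A and prices in column B, and that data starts on a known row. The names literal_text, read_prices and first_data_row are made up for this example and are not part of openpyxl or pandas. Formats that encode the currency differently (for example the [$USD] style) are not handled.

```python
import re

# Matches a double-quoted literal inside an Excel number format, e.g. "USD ".
_QUOTED_LITERAL = re.compile(r'"([^"]*)"')


def literal_text(number_format):
    """Return the quoted literal text of a number format, without outer spaces.

    Only the first section (the one used for positive numbers) is inspected.
    Assumption: no semicolon appears inside the quoted literal itself.
    """
    first_section = number_format.split(";")[0]
    return "".join(_QUOTED_LITERAL.findall(first_section)).strip()


def read_prices(path, first_data_row=2):
    """Collect item, price and currency label from every sheet of a workbook.

    Hypothetical layout: column A = item name, column B = unit price.
    """
    # Third-party dependency, imported here so literal_text works without it.
    from openpyxl import load_workbook

    records = []
    wb = load_workbook(path)
    for ws in wb.worksheets:
        for name_cell, price_cell in ws.iter_rows(min_row=first_data_row, max_col=2):
            if price_cell.value is None:
                continue  # skip empty rows
            records.append({
                "sheet": ws.title,
                "item": name_cell.value,
                "price": price_cell.value,
                "currency": literal_text(price_cell.number_format),
            })
    return records
```

The list returned by read_prices('itemlist.xlsx') can be passed straight to pd.DataFrame(...) to get one frame covering all sheets, with the currency in its own column next to the numeric price.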