Read .xlsx in .txt or .csv format with Python
Is there a way to read a .xlsx file in .txt or .csv format with Python? I'm looking for a way to read an .xlsx file while preserving number formatting (e.g., $45.890924). I searched around and could not find a viable module, and creating a style converter would be next to impossible at my Python skill level.
A few helpful notes: Pandas is not an option because it automatically wipes the number formatting, and I cannot classify each column's format in advance, since one column can contain 20+ different number formats.
openpyxl stores the content of the cell in value and the formatting in number_format (and in a few other properties for alignment, color, font, border, etc.). So you could interpret the Excel format code and translate it to a Python format, but Excel format codes can get as complex as this one:

    '_-* #,##0.00\ [$€-410]_-;\-* #,##0.00\ [$€-410]_-;_-* "-"??\ [$€-410]_-;_-@_-'
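To see what openpyxl actually exposes per cell, here is a minimal sketch. The helper names cell_formats and read_xlsx_formats, and the file name book.xlsx, are made up for illustration; only load_workbook, iter_rows, and the cell attributes coordinate, value, and number_format come from openpyxl.

```python
def cell_formats(rows):
    """Yield (coordinate, value, number_format) for every non-empty cell.

    `rows` is any iterable of rows of cell-like objects, for example
    the result of an openpyxl worksheet's iter_rows().
    """
    for row in rows:
        for cell in row:
            if cell.value is not None:
                yield cell.coordinate, cell.value, cell.number_format


def read_xlsx_formats(path):
    """Hypothetical helper: list the raw values and Excel format codes
    of the active sheet in the workbook at `path`."""
    # openpyxl is third-party (pip install openpyxl); imported lazily here.
    from openpyxl import load_workbook

    ws = load_workbook(path).active
    return list(cell_formats(ws.iter_rows()))


if __name__ == "__main__":
    # "book.xlsx" is a placeholder path; point it at a real workbook.
    for coord, value, fmt in read_xlsx_formats("book.xlsx"):
        print(coord, repr(value), repr(fmt))
```

This only lists the raw value next to its format code; turning the pair into the displayed text still requires interpreting the code yourself.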
All that said, making a translator is not impossible. Below is a simple function to translate Excel date format strings to Python's strftime() directives.
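    from datetime import datetime
    from re import findall

    def date_xl2py(dt, xlcode):
        """Translate an Excel date format code into strftime() directives.

        dt is not used by the translation; it is kept so the call below works as written.
        """
        xl2py = {
            'yy'    : '%y',
            'yyyy'  : '%Y',
            'm'     : '%m',  ## always zero-padded
            'mm'    : '%m',
            'mmm'   : '%b',
            'mmmm'  : '%B',
            'mmmmm' : '%b',  ## no single-letter form
            'd'     : '%d',  ## always zero-padded
            'dd'    : '%d',
            'ddd'   : '%a',
            'dddd'  : '%A',
            '%'     : '%%'   ## escape the % char
        }
        pycode = []
        # Tokens: a quoted literal, a run of one date letter, or any single character.
        # No capture group here, so findall() returns the full matches.
        for xlpart in findall(r'"[^"]*"|d+|m+|y+|.', xlcode):
            if xlpart in xl2py:
                pycode.append(xl2py[xlpart])
            elif len(xlpart) > 1 and xlpart.startswith('"'):
                # Quoted text is literal in Excel: drop the quotes, escape any %.
                pycode.append(xlpart[1:-1].replace('%', '%%'))
            else:
                pycode.append(xlpart)
        return ''.join(pycode)

    dt = datetime(2022, 7, 12, 15, 56)
    dt.strftime(date_xl2py(dt, 'ddd, mmmm dd, yyyy'))
    # 'Tue, July 12, 2022'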
Please note that I didn't take locale into account.
Also, Excel offers three (rather useless) date formatting options that are not available in Python (see the comments in the code); I just mapped them to the most similar option available.
And finally, if you were to add time formats, you would need to handle the fact that "mm" may mean months or minutes in Excel, and select the right option based on context.
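One way to make that choice, sketched under some assumptions: treat "m" or "mm" as minutes when the nearest date/time code before it is an hour code or the nearest one after it is a seconds code, and as months otherwise. The name datetime_xl2py is made up here, the code only handles lowercase format letters, and it ignores AM/PM markers, elapsed-time codes like [h], and fractional seconds.

```python
import re
from datetime import datetime

# Unambiguous codes only; 'm' and 'mm' are resolved from context below.
XL2PY = {
    'yy': '%y', 'yyyy': '%Y',
    'mmm': '%b', 'mmmm': '%B', 'mmmmm': '%b',
    'd': '%d', 'dd': '%d', 'ddd': '%a', 'dddd': '%A',
    'h': '%H', 'hh': '%H', 's': '%S', 'ss': '%S',
}


def datetime_xl2py(xlcode):
    """Translate a simple Excel date/time format code to strftime() directives."""
    tokens = re.findall(r'"[^"]*"|d+|m+|y+|h+|s+|.', xlcode)
    # Indexes of date/time code tokens, skipping separators and quoted literals.
    codes = [i for i, tok in enumerate(tokens) if tok[0] in 'dmyhs']
    pycode = []
    for i, tok in enumerate(tokens):
        if tok in ('m', 'mm'):
            pos = codes.index(i)
            prev_tok = tokens[codes[pos - 1]] if pos > 0 else ''
            next_tok = tokens[codes[pos + 1]] if pos + 1 < len(codes) else ''
            # Minutes if right after hours or right before seconds, else months.
            is_minutes = prev_tok.startswith('h') or next_tok.startswith('s')
            pycode.append('%M' if is_minutes else '%m')
        elif tok in XL2PY:
            pycode.append(XL2PY[tok])
        elif len(tok) > 1 and tok.startswith('"'):
            # Quoted literal: drop the quotes and escape any % signs.
            pycode.append(tok[1:-1].replace('%', '%%'))
        else:
            pycode.append(tok.replace('%', '%%'))
    return ''.join(pycode)


print(datetime_xl2py('yyyy-mm-dd hh:mm:ss'))  # %Y-%m-%d %H:%M:%S
print(datetime(2022, 7, 12, 15, 56).strftime(datetime_xl2py('dd/mm/yyyy hh:mm')))  # 12/07/2022 15:56
```

Looking at neighbouring codes rather than neighbouring characters means separators such as ":" or "-" do not affect the decision.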