Python read SAS generated XML type .xls file
I am trying to extract tabs from hundreds of SAS-generated .xls files. I tried the following approach without luck. My version of xlrd is 0.9.2.
import xlrd
book = xlrd.open_workbook('out_1.xls')
The error message is:
Traceback (most recent call last):[Finished in 0.2s with exit code 1]
File "I:\Dropbox\Sas data\sacwin\test.py", line 3, in <module>
book = xlrd.open_workbook('out_1.xls') # Open an .xls file
File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 435, in open_workbook
ragged_rows=ragged_rows,
File "C:\Python27\lib\site-packages\xlrd\book.py", line 91, in open_workbook_xls
biff_version = bk.getbof(XL_WORKBOOK_GLOBALS)
File "C:\Python27\lib\site-packages\xlrd\book.py", line 1258, in getbof
bof_error('Expected BOF record; found %r' % self.mem[savpos:savpos+8])
File "C:\Python27\lib\site-packages\xlrd\book.py", line 1252, in bof_error
raise XLRDError('Unsupported format, or corrupt file: ' + msg)
xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '<?xml ve'
When I open the .xls file in an editor, the header looks like this:
<?xml version="1.0" encoding="windows-1252"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office">
Would you mind giving me some suggestions on how to parse these files? Thanks!
I'm looking for a solution to this problem as well. I can tell you that the file format is XML, but it pre-dates the Excel 2007 'Office Open XML (ECMA-376)' format (I think it's SpreadsheetML), so it's not supported by xlrd.
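With hundreds of files, it may help to check which flavour each one is before handing it to xlrd. The sketch below is only an illustration: the function name sniff_xls_flavour is made up, and it assumes that binary .xls files are OLE2 compound documents (which start with a fixed 8-byte signature), while the SAS output starts with an XML declaration, as in the header shown in the question.

```python
# Signature found at the start of OLE2 compound documents, the container
# used by binary Excel 97-2003 workbooks.
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def sniff_xls_flavour(path):
    """Guess the flavour of a .xls file from its first bytes (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(64)
    if head.startswith(OLE2_MAGIC):
        return "biff"  # binary workbook, the kind xlrd expects
    # Skip an optional UTF-8 byte order mark and leading whitespace.
    if head.lstrip(b"\xef\xbb\xbf \t\r\n").startswith(b"<?xml"):
        return "xml"   # XML text, e.g. the SAS-generated files in the question
    return "unknown"
```

A file reported as "xml" would raise the same "Expected BOF record" error in xlrd and needs an XML parser instead.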
If there's no Python library available and you have good prior knowledge of the structure of the files you need to process, I'd just use an XML reader.
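As a rough sketch of the XML-reader approach, something like the following could work with the standard library's xml.etree.ElementTree. It is written for Python 3 and assumes the files follow the usual SpreadsheetML layout (Workbook > Worksheet > Table > Row > Cell > Data) in the namespace shown in the question's header. The function name read_spreadsheetml is made up for illustration, and every value comes back as a string or None.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the header shown in the question.
SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}


def read_spreadsheetml(source):
    """Return {sheet_name: [[cell_text, ...], ...]} (hypothetical helper).

    `source` is a file name or a binary file object.
    """
    root = ET.parse(source).getroot()
    sheets = {}
    for ws in root.findall("ss:Worksheet", NS):
        name = ws.get("{%s}Name" % SS)
        rows = []
        # Note: an ss:Index attribute on Row (skipped rows) is ignored here.
        for row in ws.findall("ss:Table/ss:Row", NS):
            values = []
            for cell in row.findall("ss:Cell", NS):
                # ss:Index is 1-based and means earlier cells were skipped;
                # pad with None so columns stay aligned.
                index = cell.get("{%s}Index" % SS)
                if index is not None:
                    values.extend([None] * (int(index) - 1 - len(values)))
                data = cell.find("ss:Data", NS)
                # itertext() also collects text nested in formatting tags.
                values.append("".join(data.itertext()) if data is not None else None)
            rows.append(values)
        sheets[name] = rows
    return sheets
```

Calling read_spreadsheetml('out_1.xls') would then give a dictionary keyed by tab name. Converting cells to numbers or dates based on the ss:Type attribute of each Data element is left out of this sketch.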
Regards, Dave