How to read an embedded XLS file from a URL

I have to read an Excel file from a URL and extract its data. The file is named companies.xls, but when I open it in Notepad it turns out to be HTML with Excel markup embedded in it rather than a binary XLS file. Microsoft Excel opens it without any problem, and I can then save it as a real .xls file, but I need to do this programmatically or find some other way to read the file. How can I read this file's data? Since it is not a genuine XLS file, Apache POI's POIFSFileSystem fails with the error below:

java.io.IOException: Invalid header signature; read 0x6D78206C6D74683C, expected 0xE11AB1A1E011CFD0
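The two numbers in that message are the first eight bytes of the stream, read as a little-endian 64-bit value. A minimal sketch (the class and method names here are made up for illustration) that turns the reported value back into text shows what the parser actually received:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Apache POI.
class SignatureDecoder {
    // Lays the 64-bit value out as little-endian bytes and reads them as ASCII.
    static String decode(long signature) {
        byte[] bytes = ByteBuffer.allocate(Long.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(signature)
                .array();
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // The "read" value from the exception message.
        System.out.println(decode(0x6D78206C6D74683CL)); // prints "<html xm"
    }
}
```

So the stream begins with an HTML tag, while the expected value 0xE11AB1A1E011CFD0 corresponds to the bytes D0 CF 11 E0 A1 B1 1A E1 that open a binary OLE2 (.xls) file.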

Here is my code:

URL companyList = new URL("someURL.xslt");
InputStream inputStream = companyList.openStream();
// Fails here: the stream starts with HTML text, not an OLE2 header
POIFSFileSystem fs = new POIFSFileSystem(inputStream);
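One way to avoid the exception is to peek at the first eight bytes before handing the stream to POI, and only use POIFSFileSystem when they match the OLE2 signature. The following is a sketch using only the JDK; `Ole2Sniffer` and its methods are hypothetical names, and the stream must support mark/reset (for example by wrapping `companyList.openStream()` in a `BufferedInputStream`).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper, not part of Apache POI.
class Ole2Sniffer {
    // The eight bytes that open every binary OLE2 file (.xls, .doc, ...).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Reads the first eight bytes, then rewinds so the caller can still parse the stream.
    static boolean looksLikeOle2(InputStream in) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        try {
            in.mark(OLE2_MAGIC.length);
            byte[] head = new byte[OLE2_MAGIC.length];
            int filled = 0;
            while (filled < head.length) {
                int read = in.read(head, filled, head.length - filled);
                if (read < 0) {
                    break; // stream ended before eight bytes were available
                }
                filled += read;
            }
            in.reset();
            return filled == head.length && Arrays.equals(head, OLE2_MAGIC);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If `looksLikeOle2` returns false for this URL, the content would need to be treated as HTML rather than passed to POI.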

The file's contents (XLS or HTML):

<head>
<meta http-equiv=Content-Type content="text/html; charset=Windows-1254">
<meta name=sssId content=Excel.Sheet>

 <style type="text/css">
body,table,tr,th,td {font-family:Arial;font-size:11pt;color:#000;}
.th {padding:3em;background:#3366FF;text-align:center;color:#ffffff;}
 .td {padding:2em;background:#ffffff;}
 </style>

<!--[if gte mso 9]><xml>
 <x:ExcelWorkbook>
 <x:ExcelWorksheets>
<x:ExcelWorksheet>
<x:Name>Companies</x:Name>
<x:WorksheetOptions>
 <x:DefaultRowHeight>285</x:DefaultRowHeight>
 <x:FreezePanes/>
 <x:FrozenNoSplit/>
 <x:SplitHorizontal>1</x:SplitHorizontal>
 <x:TopRowBottomPane>1</x:TopRowBottomPane>
 <x:ActivePane>2</x:ActivePane>
 <x:Panes>
  <x:Pane>
   <x:Number>3</x:Number>
  </x:Pane>
  <x:Pane>
   <x:Number>2</x:Number>
   </x:Pane>
  </x:Panes>
 </x:WorksheetOptions>
</x:ExcelWorksheet>
 </x:ExcelWorksheets>
</x:ExcelWorkbook>
 </xml><![endif]-->

 </head>
<body>
<table width="100%" border="1">
 <tr><th width="85%" class="th">Company Name&#305;</th><th width="15%"     class="th">City</th></tr>
 <tr><td width="100%" class="td">Microsoft</td><td width="100" align="center" class="td" nowrap>sValley</td> </tr>
<tr><td width="100%" class="td">Google</td><td width="100" align="center" class="td" nowrap>london</td></tr>
....
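Since the content shown above is an ordinary HTML table, one option is to skip POI and read the rows directly. Below is a rough sketch with JDK regular expressions; `HtmlTableReader` is a hypothetical name, and it assumes simple, non-nested `<tr>`, `<th>` and `<td>` elements like the ones in the sample. A real HTML parser such as jsoup would likely be more robust. Character references such as `&#305;` are left undecoded here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; assumes a flat table without nested tables.
class HtmlTableReader {
    private static final Pattern ROW = Pattern.compile(
            "<tr\\b[^>]*>(.*?)</tr>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern CELL = Pattern.compile(
            "<t[dh]\\b[^>]*>(.*?)</t[dh]>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns one list of cell texts per <tr>, header row included.
    static List<List<String>> readRows(String html) {
        List<List<String>> rows = new ArrayList<>();
        Matcher row = ROW.matcher(html);
        while (row.find()) {
            List<String> cells = new ArrayList<>();
            Matcher cell = CELL.matcher(row.group(1));
            while (cell.find()) {
                // Drop any inline markup inside the cell and trim whitespace.
                cells.add(cell.group(1).replaceAll("<[^>]+>", "").trim());
            }
            rows.add(cells);
        }
        return rows;
    }
}
```

The `html` argument would come from reading the URL's stream and decoding it with the charset declared in the file (Windows-1254 in this case).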

I had the same issue. The fix is to change charset=Windows-1254 to charset=utf-8 in that XLS file.
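The answer does not say how the change was made or which tool then read the file. As one possible way to do it in code, the sketch below (hypothetical class and method names) swaps the declared charset and re-encodes the bytes so that they match the new declaration. It assumes the downloaded bytes really are Windows-1254 and that the JDK in use provides that charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the charset change described above.
class CharsetRewriter {
    // Replaces the declared charset in the markup, ignoring case.
    static String declareUtf8(String html) {
        return html.replaceAll("(?i)charset=windows-1254", "charset=utf-8");
    }

    // Decodes the original bytes as Windows-1254, updates the declaration,
    // and re-encodes as UTF-8 so content and declaration agree.
    static byte[] toUtf8(byte[] windows1254Bytes) {
        String html = new String(windows1254Bytes, Charset.forName("windows-1254"));
        return declareUtf8(html).getBytes(StandardCharsets.UTF_8);
    }
}
```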

