
PowerShell | OLEDB | Excel | Unable to read data from XLSX created from third-party app

**Update (2012.12.13) - added sample code, input formatting, and output sample**

I've been fighting with this for a few days now, and I've run out of ideas. I've tested a script (which I can upload later once I have it in front of me) successfully against multiple XLSX files. The connection string works, the script parses the data that I need, and so on.

The issue is that when I attempt to process my actual input files (generated by a third-party reporting application), the data is not read from the worksheet.

If I open and save the input file in Excel (no format changes, no data entry or removal, no modifications at all), the re-saved file is then processed and all of its data is parsed.

I've tried several 'Extended Properties' settings in the connection string, including HDR=Yes/No and IMEX=1, to no avail.
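For anyone wanting to repeat that experiment, here is a minimal sketch of how the 'Extended Properties' variations could be generated. The function name `New-ExcelConnectionString` and its parameters are hypothetical and not part of the original script; the provider, `OLE DB Services=-4`, and `Excel 12.0 Xml` values are copied from the sample code in this post.

```powershell
# Hypothetical helper (not from the original script): builds an ACE OLEDB
# connection string for an .xlsx file with the given HDR and IMEX settings.
function New-ExcelConnectionString {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [ValidateSet('YES', 'NO')] [string] $Hdr = 'YES',
        [ValidateSet(0, 1, 2)] [int] $Imex = 1
    )

    $extended = "Excel 12.0 Xml;HDR=$Hdr;IMEX=$Imex"

    # Extended Properties contains semicolons, so its value is wrapped in double quotes.
    'Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};OLE DB Services=-4;Extended Properties="{1}";' -f $Path, $extended
}

# Print one connection string per header setting for the same file.
foreach ($hdr in 'YES', 'NO') {
    New-ExcelConnectionString -Path '.\OLEDB\test\inputFile_original.xlsx' -Hdr $hdr
}
```

Each string can then be passed to `System.Data.OleDb.OleDbConnection` in the same way as `$ConnectionString` in the sample code.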

Has anyone seen anything like this before?


#inputFile_original.xlsx will not parse the data from the worksheet
#inputFile_original_reSaved.xlsx parses the data without any issues

$fileName  = "inputFile_original.xlsx"
#$fileName = "inputFile_original_reSaved.xlsx"
$filePath  = ".\OLEDB\test\"

#Build the connection string
$ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source="
$ConnectionString += (Join-Path -Path $filePath -ChildPath $fileName)
$ConnectionString += ";OLE DB Services=-4;Extended Properties="
$ConnectionString += '"Excel 12.0 Xml;HDR=YES;IMEX=1";'

$conn   = New-Object System.Data.OleDb.OleDbConnection($ConnectionString)
$conn.Open()

$tables = $conn.GetOleDbSchemaTable([System.Data.OleDb.OleDbSchemaGuid]::tables,$null)

$cmd    = New-Object System.Data.OleDb.OleDbCommand("Select * FROM [$($tables.rows[0].TABLE_NAME)]",$conn)

$da     = New-Object System.Data.OleDb.OleDbDataAdapter($cmd)
$ds     = New-Object System.Data.DataSet
$da.Fill($ds)

#Output the data to the console
$ds.tables

Also, the input file is not laid out in an easy-to-use format, again because it is generated by a third-party application.

There are blank lines, and the header row doesn't start on row 1.

     A           B           C           D           E           F
   --------------------------------------------------------------------------
01 | ReportTitle
02 |
03 | ColHeader1  ColHeader2  ColHeader3  ColHeader4  ColHeader5  ColHeader6
04 | Data        Data        Data        Data        Data        Data
05 | Data        Data        Data        Data        Data        Data
06 | Data        Data        Data        Data        Data        Data
07 | Data        Data        Data        Data        Data        Data
08 | Data        Data        Data        Data        Data        Data
09 |
10 | Total: 5
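One way to deal with a header row that is not on row 1 is to query a cell range instead of the whole sheet, so that HDR=YES applies to the first row of that range. The sketch below only builds the query text. `Get-RangeQuery` is a hypothetical helper, the range `A3:F8` is taken from the sample layout above, and it assumes the sheet name comes from the schema table in the form `Sheet1$` (possibly wrapped in single quotes). It does not address why the original file returns no data.

```powershell
# Hypothetical helper: turns a sheet name from the schema table (e.g. 'Sheet1$')
# into a SELECT statement restricted to a cell range.
function Get-RangeQuery {
    param(
        [Parameter(Mandatory)] [string] $TableName,
        [Parameter(Mandatory)] [string] $Range
    )

    # Assumption: names containing spaces arrive wrapped in single quotes; strip them.
    $sheet = $TableName.Trim("'")

    'SELECT * FROM [{0}{1}]' -f $sheet, $Range
}

# Header on row 3, data on rows 4-8, columns A-F (per the sample layout).
Get-RangeQuery -TableName 'Sheet1$' -Range 'A3:F8'
```

The resulting string could replace the `"Select * FROM [...]"` text passed to `OleDbCommand` in the sample code.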

The output that I'm receiving is shown below.
(A) original file

Report Title :
F2           :
F3           :
F4           :
F5           :
F6           :

Report Title : ColHeader1
F2           : 
F3           : 
F4           : 
F5           : 
F6           :

(B) re-saved file

Report Title :
F2           :
F3           :
F4           :
F5           :
F6           :

Report Title : ColHeader1
F2           : ColHeader2
F3           : ColHeader3
F4           : ColHeader4
F5           : ColHeader5
F6           : ColHeader6

Report Title : Data
F2           : Data
F3           : Data
F4           : Data
F5           : Data
F6           : Data

Report Title : Data
F2           : Data
F3           : Data
F4           : Data
F5           : Data
F6           : Data

Report Title : Data
F2           : Data
F3           : Data
F4           : Data
F5           : Data
F6           : Data

Report Title : Data
F2           : Data
F3           : Data
F4           : Data
F5           : Data
F6           : Data

Report Title : Data
F2           : Data
F3           : Data
F4           : Data
F5           : Data
F6           : Data

Report Title :
F2           :
F3           :
F4           :
F5           :
F6           :

Report Title : Total: 5
F2           :
F3           :
F4           :
F5           :
F6           :

Rather than reading the Excel files through OLEDB, you could open them in Excel itself from PowerShell. This is a quick sample that just prints the dimensions of each sheet. The only issue I ran into was that you have to fully quit Excel every time and purge any remnant of it (the GC calls at the end).

# $excelFiles is assumed to hold the paths of the workbooks to inspect
foreach ($File in $excelFiles)
{
    # Start a fresh Excel instance for each file
    $excel = New-Object -ComObject Excel.Application
    $excel.Visible = $false
    $workbook = $excel.Workbooks.Open($File)

    Write-Host "There are $($workbook.Sheets.Count) sheets in $File"
    for ($i = 1; $i -le $workbook.Sheets.Count; $i++)
    {
        $worksheet = $workbook.Sheets.Item($i)
        $rowMax = ($worksheet.UsedRange.Rows).Count
        $columnMax = ($worksheet.UsedRange.Columns).Count
        Write-Host "Sheet $($i) ($($worksheet.Name)) has dimensions $($rowMax) x $($columnMax)"

        $worksheet = $rowMax = $columnMax = $null
    } #end for

    # Close without saving, quit Excel, and force cleanup of the COM references
    $workbook.Close($false)
    $workbook = $null
    $excel.Quit()
    $excel = $null
    [gc]::Collect()
    [gc]::WaitForPendingFinalizers()
} #end foreach
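If a script fails partway through, the cleanup at the end of the loop never runs and an Excel process can be left behind. A possible variation, sketched under the assumption that Excel is installed, is to wrap the work in try/finally and release the COM references explicitly. `Invoke-WithWorkbook` is a hypothetical name and not part of the answer above.

```powershell
# Hypothetical wrapper: runs a script block against a workbook and always
# shuts Excel down afterwards, even if the script block throws.
function Invoke-WithWorkbook {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [scriptblock] $Action
    )

    $excel = New-Object -ComObject Excel.Application
    $workbook = $null
    try {
        $excel.Visible = $false
        $workbook = $excel.Workbooks.Open($Path)
        # Pass the workbook to the caller's script block as its first argument.
        & $Action $workbook
    }
    finally {
        if ($workbook) {
            $workbook.Close($false)   # close without saving
            [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($workbook)
        }
        $excel.Quit()
        [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($excel)
        [gc]::Collect()
        [gc]::WaitForPendingFinalizers()
    }
}
```

It could be called as `Invoke-WithWorkbook -Path $File -Action { param($wb) $wb.Sheets.Count }`, where `$File` is a full path to a workbook as in the loop above.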
