
Why is it so slow to read an Excel file with PowerShell?

I have a small Excel file (28 KB, XLSX format) that I would like to modify with PowerShell. The file contains 59 rows and 366 columns.

My code walks through the first column looking for a specific entry, then walks along the row it found and outputs the contents of that row together with the first row. This is the code:

# Define some parameters.
$year = "2015"
$filename = "C:\...\file.xlsx"
$person = "Lastname, Firstname"

# Open Excel file and select worksheet.
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$workbook = $excel.Workbooks.Open($filename)
$worksheet = $workbook.sheets.item($year)
$cells = $worksheet.cells

# Search person name in first column.
$rows = $worksheet.UsedRange.Rows.count
"Rows: $rows"
$row = 1

while ($row -le $rows)
{
  $cell = $cells.item($row,1).value2
  if ($person -eq $cell) {
    break
  }
  $row++
}

# List row
$cols = $worksheet.UsedRange.Columns.count
"Cols: $cols"
foreach ($col in 2..$cols)
{
  $date = $cells.item(1,$col).value2
  $data = $cells.item($row,$col).value2
  $date = [DateTime]::FromOADate($date)
  $msg = $date.ToString("yyyy-MM-dd") + " " + $data
  "$msg"
}

# Close workbook and Excel file and release COM object.
$workbook.close()
$excel.quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($excel)

My problem: the program is terribly slow. It takes more than 5 minutes to iterate over the 366 columns!

PS C:\...> Measure-Command { .\program.ps1 }

Days              : 0
Hours             : 0
Minutes           : 5
Seconds           : 33
Milliseconds      : 580
Ticks             : 3335806616
TotalDays         : 0,00386088728703704
TotalHours        : 0,0926612948888889
TotalMinutes      : 5,55967769333333
TotalSeconds      : 333,5806616
TotalMilliseconds : 333580,6616

I can hardly believe that this is normal. Instead I think that there is something really wrong with my program. But I have no idea what it is.

What do I have to change to make it faster?

Using a loop with Find to replace cell values in Excel takes forever: I had 111 cells to replace and it took about 40 seconds. The Replace method is considerably faster. To fill in a value taken from a neighboring cell, though, you have to switch the Excel application's reference style to xlR1C1 so that the replacement can be a relative formula. Below is how I replace every cell containing the string "No registered hostname" with the value of the cell to its left, which in my data is the IP address. I have commented out the while loop I used previously. Since you intend to update the file, you may want to consider this approach.

$Range=$WorkSheet.Range("B1").EntireColumn
# Replace Cells with No registered hostname
$SearchString="No registered hostname"
# Switch to the xlR1C1 reference style so the replacement can be a relative formula.
# The documented XlReferenceStyle constants are xlA1 = 1 and xlR1C1 = -4150.
$xls.Application.ReferenceStyle=-4150
$Range.Replace($SearchString, "=RC[-1]")
# while ($NoDNS=$Range.find($SearchString))
#   {
#   $NoDNS.Activate()
#   $RefRow=$NoDNS.Row
#   $NoDNS.value()=$WorkSheet.Cells.Item($RefRow, 1).Text
#   }
$xls.Application.ReferenceStyle=1

Using Replace takes only a split second to make all the necessary changes, compared to the previous while loop.
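For the read-only case in the question, the same idea (fewer, larger calls into Excel) can be sketched by fetching the whole used range once and looping over the resulting array in memory. This is a minimal sketch, not a verified fix for the timing above. It assumes that each per-cell COM call is the expensive part, that the used range starts at cell A1, and that it spans more than one cell, so that Value2 returns a 2D array. The function names Format-PersonRow and Get-PersonRow are hypothetical.

```powershell
# Hypothetical helper: works purely in memory on the 2D array that
# Range.Value2 returns for a multi-cell range (1-based when it comes from Excel).
function Format-PersonRow {
    param($Data, [string]$Person)

    $firstRow = $Data.GetLowerBound(0); $lastRow = $Data.GetUpperBound(0)
    $firstCol = $Data.GetLowerBound(1); $lastCol = $Data.GetUpperBound(1)

    # Search the first column for the person.
    $row = -1
    for ($r = $firstRow; $r -le $lastRow; $r++) {
        if ($Person -eq $Data[$r, $firstCol]) { $row = $r; break }
    }
    if ($row -lt 0) { return }   # not found: emit nothing

    # Pair each date in the header row with the value in the person's row.
    for ($c = $firstCol + 1; $c -le $lastCol; $c++) {
        $serial = [double]$Data[$firstRow, $c]
        $date = [DateTime]::FromOADate($serial)
        $date.ToString("yyyy-MM-dd") + " " + $Data[$row, $c]
    }
}

# Hypothetical wrapper: opens the workbook, reads the used range with a
# single COM call, and hands the array to the in-memory helper.
function Get-PersonRow {
    param([string]$Path, [string]$Sheet, [string]$Person)

    $excel = New-Object -ComObject Excel.Application
    $excel.Visible = $false
    $workbook = $null
    try {
        $workbook = $excel.Workbooks.Open($Path)
        # Assumption: the used range starts at A1 and has more than one cell.
        $data = $workbook.Sheets.Item($Sheet).UsedRange.Value2
        Format-PersonRow -Data $data -Person $Person
    }
    finally {
        if ($workbook) { $workbook.Close($false) }
        $excel.Quit()
        [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($excel)
    }
}
```

With the parameters from the question, the call would be Get-PersonRow -Path "C:\...\file.xlsx" -Sheet "2015" -Person "Lastname, Firstname". Keeping the search and formatting in Format-PersonRow, separate from the COM code, means that part can be exercised without Excel installed.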


 