简体   繁体   English

如何在没有Excel或第3方工具的情况下将xlsx文件读入.net?

[英]How can I read xlsx files into .net without Excel or 3rd party tools?

I need to read an Excel xlsx file over the network, without using 3rd party tools or having Office installed. 我需要通过网络读取Excel xlsx文件,而无需使用第三方工具或安装Office。 I can write in Powershell or C# for this project and am not having luck finding a solution for either language with my requirements. 我可以用Powershell或C#编写此项目,但没有运气能找到满足我要求的任何一种语言的解决方案。 It needs to be native .net. 它必须是本地.net。

Does anyone have an example of how to do this? 有没有人举过一个例子?

This solution doesn't work powershell excel access without installing Excel 如果不安装Excel,此解决方案将无法对Powershell excel进行访问

As I do not have 'Microsoft.Jet.OLEDB.4.0' registered and apparently from searching, you can only use it on 32 bit machines? 由于我没有注册“ Microsoft.Jet.OLEDB.4.0”,显然是从搜索中获得的,您只能在32位计算机上使用它吗?

ImportExcel PowerShell module is a wrapper for EPPLus.dll . ImportExcel PowerShell模块是EPPLus.dll的包装。 It provide simple commmands that you can call to interact with Excel files (xlxs only). 它提供了简单的命令,您可以调用这些命令来与Excel文件进​​行交互(仅xlxs)。

The repository is found here, written by Doug Finke: https://github.com/dfinke/ImportExcel 该存储库位于此处,由Doug Finke编写: https : //github.com/dfinke/ImportExcel

You can get it from the PowerShell Gallery as well with Install-Module ImportExcel 您可以通过PowerShell库以及Install-Module ImportExcel获取它。

If you decide use C# you could use OpenXML. 如果决定使用C#,则可以使用OpenXML。 Just install it using the Nuget package manager. 只需使用Nuget软件包管理器进行安装。 It doesn't need Office installed. 它不需要安装Office。

Example: 例:

      using (SpreadsheetDocument spreadsheetDocument =   SpreadsheetDocument.Open("myFile.xlsx", false))
        {
            WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
            WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
            SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
            foreach (Row r in sheetData.Elements<Row>())
            {
                foreach (Cell c in r.Elements<Cell>())
                {
                    Console.Write(c.InnerText + ",");
                }
                Console.WriteLine("\n");
            }
            Console.WriteLine();
            Console.ReadKey();
        }

And add this to the using's 并将其添加到使用的

using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

This is not a complete solution for all scenarios as I didn't have time to map out every xml, but this should get anyone started with a similar situation. 对于我来说,这并不是一个完整的解决方案,因为我没有时间来绘制每个xml,但这应该会让任何人都陷入类似的情况。

Hopefully this will save someone else time in having to develop something similar, the code below converts an xlsx into a zip file, decompresses it and then maps the different xmls to create a virtual representation of the spreadsheet. 希望这可以节省其他人开发类似内容的时间,下面的代码将xlsx转换为zip文件,对其进行解压缩,然后映射不同的xml以创建电子表格的虚拟表示形式。

It's not the fastest code, but if you were looking for speed you wouldn't use Powershell anyway. 它不是最快的代码,但是如果您追求速度,则无论如何都不会使用Powershell。

<#

    $columnListOfTuples and $rowListOfTuples have everything mapped

    Each xlsx file is a compressed group of xml files into a zip file that is renamed .xlsx
    There is an xml of shared strings that are then represented by a single numerical value to save space
    There is an xml for each sheet in the spread sheet that has numerical references to the values in the shared strings xml
    To produce a virtual/programmatic representation of the spreadsheet, the code will read the sheet, then read the numerical reference than then cross reference that to the shared strings value
    I have no mapped out all functions and xmls in the compressed file, but this is enough to get you started
    It's only configured for columns A - Z, you'll have to write code for additional columsn
    You need powershell 5.0 or later to have the native unzip feature
    It assumes your first worksheet is named worksheet1
#>

cls;

#Region Read Xlsx file

    #Region Create temporary zip file and location

    $fileDate = [DateTime]::Now.ToShortDateString();
    $fileDate = $fileDate.Replace("/", "-");
    $fileTime = [DateTime]::Now.ToShortTimeString();
    $fileTime = $fileTime.Replace(":", ".");
    $fileTime = $fileTime.Replace(" ", ".");
    $fileDateTime = $fileDate + "_" + $fileTime;

    $networkFile = "http://[xxxxx].xlsx";
    $originalFile = "c:\temp\file.xlsx"; # file must exist and location must have writable rights
    #Invoke-WebRequest -Uri $networkFile -OutFile $originalFile -UseDefaultCredentials; # this is to download a network file locally, may not be needed

    $index = $originalFile.LastIndexOf("\");
    $originalFileDirectory = $originalFile.Remove($index);
    $tempCopiedFileName = $originalFileDirectory + "\tempXlsx-" + $fileDateTime + ".zip";
    $desinationPath = $originalFileDirectory + "\xlsxDecompressed";

    #Endregion /Create temporary zip file and location

    #Region Map all of the zip/xlsx file paths

    $relsXmlFile = $desinationPath + "\_rels\.rels";
    $item1XmlRelsXmlFile = $desinationPath + "\customXml\_rels\item1.xml.rels";
    $item2XmlRelsXmlFile = $desinationPath + "\customXml\_rels\item2.xml.rels";
    $item3XmlRelsXmlFile = $desinationPath + "\customXml\_rels\item3.xml.rels";
    $item1XmlFile = $desinationPath + "\customXml\item1.xml";
    $item2XmlFile = $desinationPath + "\customXml\item2.xml";
    $item3XmlFile = $desinationPath + "\customXml\item3.xml";
    $itemProps1XmlFile = $desinationPath + "\customXml\itemProps1.xml";
    $itemProps2XmlFile = $desinationPath + "\customXml\itemProps2.xml";
    $itemProps3XmlFile = $desinationPath + "\customXml\itemProps3.xml";
    $appXmlFile = $desinationPath + "\[filename]\docProps\app.xml";
    $coreXmlFile = $desinationPath + "\[filename]\docProps\core.xml";
    $customXmlFile = $desinationPath + "\[filename]\docProps\custom.xml";
    $workbookXmlRelsXmlFile = $desinationPath + "\xl\_rels\workbook.xml.rels";
    $theme1XmlFile = $desinationPath + "\xl\theme\theme1.xml";
    $sheet1XmlRelsXmlFile = $desinationPath + "\xl\worksheets|_rels|sheet1.xml.rels";
    $workSheet1XmlFile = $desinationPath + "\xl\worksheets\sheet1.xml";
    $sharedStringsXmlFile = $desinationPath + "\xl\sharedStrings.xml";
    $stylesXmlFile = $desinationPath + "\xl\styles.xml";
    $workbookXmlFile = $desinationPath + "\xl\workbook.xml";
    $contentTypesXmlFile = $desinationPath + "\[Content_Types].xml";

    #Endregion /Map all of the zip/xlsx file paths

    Copy-Item $originalFile -Destination $tempCopiedFileName -Recurse; # copy the xlsx file as a temporary zip file
    Expand-Archive -LiteralPath $tempCopiedFileName -DestinationPath $desinationPath -Force; # unzip the file
    [xml]$fileLines = Get-Content($sharedStringsXmlFile); # read the contents of the shared strings file

    #Region Map shared strings

    $arr_SharedStrings = New-Object System.Collections.Generic.List[String];
    for ($i = 0; $i -lt $fileLines.sst.si.Count; $i++ )
    {
        $line = $fileLines.sst.si[$i];
        $tuple = [tuple]::create([int]$i, [string]("`"" + $line.InnerText + "`"") );
        $arr_SharedStrings.Add($tuple); # map the shared strings to the representational number inside of the xml files
    }

    #Endregion /Map shared strings

    [xml]$fileLines = Get-Content($workSheet1XmlFile);
    $sheetData = $fileLines.worksheet.sheetData;

    #Region Map rows
    $arr_sheet1_rows = New-Object System.Collections.ArrayList; # this will have a list/tuple of each cell that makes up a row
    for ($i = 0; $i -lt $sheetData.row.Count; $i++ )
    {
        $arr_sheet1_individualRow = New-Object System.Collections.Generic.List[String];
        $rowElements = $sheetData.row[$i].c; #grab data of all cells in that row
        foreach ($element in $rowElements)
        {
            $elementsTuple = [tuple]::create([string]$element.r, [string]$element.s, [string]$element.v)
            $arr_sheet1_individualRow.Add($elementsTuple);      
        }

        $rowTuple = [Tuple]::create([int] $i+1, $arr_sheet1_individualRow);
        $arr_sheet1_rows.Add($rowTuple) | Out-Null;
    }

    $rowListOfTuples = New-Object System.Collections.ArrayList; # has row number, then column letter, then cell value
    foreach ($rowMapped in $arr_sheet1_rows)
    {
        $cellValues = $rowMapped.Item2;
        $rowNumber = $rowMapped.Item1;
        foreach ($cellValue in $cellValues)
        {
            $cellValue = $cellValue.Remove(0, 1);
            $cellValue = $cellValue.Remove($cellValue.Length - 1, 1);
            $cellValue = $cellValue.Replace(" ", "");
            $cellValuesSplit = $cellValue.Split(",");
            $columnLetter = $cellValuesSplit[0];
            $columnLetter = $columnLetter -replace '[0-9]',''
            $cellValueOnly = $cellValuesSplit[2];

            $cellTuple = [tuple]::create([int]$rowNumber, [string]$columnLetter, [string]$cellValueOnly);
            $rowListOfTuples.Add($cellTuple) | Out-Null;
        }
    }

    #Endregion /Map shared rows

    #Region Map shared columns

    $rowCount = $arr_sheet1_rows.Count;
    $sheetDataCols = $fileLines.worksheet.cols;
    $arr_sheet1_columns = New-Object System.Collections.ArrayList; # this will have a list/tuple of each cell that makes up a column
    #$arr_sheet1_columns.Add("Place holder") | Out-Null;
    #for ($i = 0; $i -lt $sheetDataCols.col.Count; $i++ )
    for ($i = 0; $i -lt $rowCount; $i++ )
    {
        if ($arr_sheet1_columns.Count -lt $sheetDataCols.col.Count)
        {
            for ($j = 0; $j -lt $sheetDataCols.col.Count; $j++ )
            {
                switch  ($j)
                {
                    0 { $columnLetterTranslated = "A"; break; }
                    1 { $columnLetterTranslated = "B"; break; }
                    2 { $columnLetterTranslated = "C"; break; }
                    3 { $columnLetterTranslated = "D"; break; }
                    4 { $columnLetterTranslated = "E"; break; }
                    5 { $columnLetterTranslated = "F"; break; }
                    6 { $columnLetterTranslated = "G"; break; }
                    7 { $columnLetterTranslated = "H"; break; }
                    8 { $columnLetterTranslated = "I"; break; }
                    9 { $columnLetterTranslated = "J"; break; }
                    10 { $columnLetterTranslated = "K"; break; }
                    11 { $columnLetterTranslated = "L"; break; }
                    12 { $columnLetterTranslated = "M"; break; }
                    13 { $columnLetterTranslated = "N"; break; }
                    14 { $columnLetterTranslated = "O"; break; }
                    15 { $columnLetterTranslated = "P"; break; }
                    16 { $columnLetterTranslated = "Q"; break; }
                    17 { $columnLetterTranslated = "R"; break; }
                    18 { $columnLetterTranslated = "S"; break; }
                    19 { $columnLetterTranslated = "T"; break; }
                    20 { $columnLetterTranslated = "U"; break; }
                    21 { $columnLetterTranslated = "V"; break; }
                    22 { $columnLetterTranslated = "W"; break; }
                    23 { $columnLetterTranslated = "X"; break; }
                    24 { $columnLetterTranslated = "Y"; break; }
                    25 { $columnLetterTranslated = "Z"; break; }
                }
                $arr_sheet1_columns.Add($columnLetterTranslated) | Out-Null;
            }       
        }

        $rowElements = $sheetData.row[$i].c; #grab data of all cells in that row
        foreach ($element in $rowElements)
        {
            $columnLetter = $element.r -replace '[^a-zA-Z-]',''
            for ($k = 0; $k -lt $arr_sheet1_columns.Count; $k++)
            {
                $column = $arr_sheet1_columns[$k];
                if ($column.StartsWith($columnLetter))
                {
                    $columnTemp = $column.ToString() + "|" + $element.v;
                    $arr_sheet1_columns.Remove($column);
                    $arr_sheet1_columns.Insert($k, $columnTemp);
                    $stop = "here";
                }
            }
        }
    }

    $columnListOfTuples = New-Object System.Collections.ArrayList; # has column letter, than row number, then cell value
    foreach ($columnMapped in $arr_sheet1_columns)
    {
        $index = $columnMapped.IndexOf("|");
        $columnMappedLetter = $columnMapped.Remove($index);

        $columnIndividualValues = $columnMapped.Remove(0, $index);
        $columnIndividualValuesSplit = $columnIndividualValues.Split("|");

        $rowNumber = 0;
        foreach($columnIndividualValueSplit in $columnIndividualValuesSplit)
        {   
            $cellTuple = [tuple]::create([string]$columnMappedLetter, [int]$rowNumber, [string]$columnIndividualValueSplit);
            $columnListOfTuples.Add($cellTuple) | Out-Null;
            $rowNumber ++;
        }
    }

    #Endregion /Map shared columns



    # Here you have all of the data and you can parse it however you want, below is an example of that

    #Region Parse the extracted data

    $rowsForParsing = New-Object System.Collections.ArrayList;
    foreach ($arr_sheet1_row in $arr_sheet1_rows)
    {
        $rowNumber = $arr_sheet1_row.Item1;
        $cellValues = New-Object System.Collections.ArrayList; 
        $rowTemp = "";

        $indexer = 0;
        foreach ($rowValue in $arr_sheet1_row.Item2)
        {
            $valueSplit = $rowValue.Replace(" ", "");
            $valueSplit = $valueSplit.Replace("(", "");
            $valueSplit = $valueSplit.Replace(")", "");
            $valueSplit = $valueSplit.Split(",");

            $cellLocation = "";
            $cellLocation = $valueSplit[0];
            $cellSType = "";
            $cellSType = $valueSplit[1];
            $cellStringRepresentationValue = "";
            $cellStringRepresentationValue = $valueSplit[2];
            $translatedValue = "";

            if ($cellStringRepresentationValue -ne "")
            {
                switch ($cellSType)
                {
                    "1" { $translatedValue = getStringMappedValue($cellStringRepresentationValue); break;}  # represents column headers
                    "2" { $translatedValue = [datetime]::FromOADate($cellStringRepresentationValue); break;}  # number of days from January 1, 1900
                    "3" { $translatedValue = getStringMappedValue($cellStringRepresentationValue); break;}  # mapped value to strings table
                    "4" {$translatedValue = getStringMappedValue($cellStringRepresentationValue); break;}  # mapped value to strings table
                    "5" { $translatedValue = getStringMappedValue($cellStringRepresentationValue); break;}  # mapped value to strings table
                    "6" { Write-Host "#6: " $rowValue; break;}  # not sure what 6 represents yet
                    "7" { Write-Host "#7: " $rowValue; break;}  # not sure what 7 represents yet
                    "8" { Write-Host "#8: " $rowValue; break;}  # not sure what 8 represents yet
                    "9" { Write-Host "#9: " $rowValue; break;}  # not sure what 9 represents yet
                    "10" { Write-Host "#10: " $rowValue; break;}  # not sure what 10 represents yet
                    "11" { Write-Host "#11: " $rowValue; break;}  # not sure what 11 represents yet     
                }
            }

            if (($rowTemp -eq "") -and ($indexer -eq 0))
            {
                $rowTemp = "$translatedValue";
            }
            else
            {       
                $rowTemp = "$rowTemp" + "|" + "$translatedValue";
            }   

            $indexer++;
        }

        if ($rowNumber -eq 22)
        {
            $stop = "here";
        }

        $rowsForParsing.Add("$rowNumber|$rowTemp") | Out-Null;
    }

    $fileInformation = New-Object System.Collections.ArrayList;
    $topRow = $rowsForParsing[1];
    foreach ($rowParsed in $rowsForParsing)
    {
        $rowParsedSplit = $rowParsed.Split("|");    
        $date = $rowParsedSplit[1];

        $pass = $true;
        try
        {
            $dtVal = get-date $date;
        }
        catch
        {
            $pass = $false;
        }

        if ($pass -eq $true)
        {
            if ((get-date) -gt (get-date $date))
            {
                $dateApplicableRow = $rowParsed;
            }
            else
            {
                Write-Host "Top row is $topRow";
                Write-Host "This weeks row is $dateApplicableRow";
                $fileInformation.Add($topRow);
                $fileInformation.Add($dateApplicableRow);
                break;
            }
        }
    }

    #Endregion /Parse the extracted data

#EndRegion /Read Xlsx file

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

相关问题 如何使用protobuf-net或其他序列化程序序列化第三方类型? - How can I serialize a 3rd party type using protobuf-net or other serializers? 生成 excel 没有第 3 方组件的合并单元格? - Generate excel Merged cells without 3rd party component? 在 .NET 中刻录 A(视频)DVD,无需第三方库 - Burn A (Video) DVD In .NET Without 3rd Party Libraries 为没有身份的 .net 核心创建定制的第三方授权 - Creating besopoke 3rd party authoirsation for .net core without Identiy 如何在不等待 ASP.NET API 中的响应的情况下处理多个 3rd 方 API - How to process multiple 3rd party APIs without waiting for their response in ASP.NET API 在没有第三方控件的情况下,如何在Win Form [.Net]中下钻报表? - How do you do drill down reporting in a Win Form [.Net] without 3rd party controls? 如何在不使用第三方工具的情况下通过 C# windows 应用程序以编程方式发送电子邮件时禁用 Outlook 安全警告 - How to disable outlook security warning while sending emails programmatically through C# windows application without using 3rd party tools 是否可以在不使用HtmlAgilityPack之类的第三方库的情况下抓取HTML类? - Can I scrape an HTML class without using a 3rd party library like HtmlAgilityPack? 如何从C#锁定DB文件,以便第三方软件无法访问它们 - How to lock DB files from C# so that 3rd party softwares can't access them 我如何验证第三方用户访问我的Web API asp.net? - how do i authenticate 3rd party user to access my web API asp.net?
 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM