PowerShell - convert XML to CSV

I was able to convert XML to CSV using the following code:

# Read from file
[xml]$inputFile = Get-Content "c:\pstest\test.xml"
# Export XML as CSV
$inputFile.Transaction.ChildNodes | Export-Csv "c:\pstest\test.csv" -NoTypeInformation -Delimiter ";" -Encoding UTF8

It works if the file contains only one root node with a single type of child node, for example:

<?xml version="1.0" encoding="UTF-8"?>
<Transaction>
    <TXNDETAIL>
        <RecordID>02</RecordID>
        <SequenceNumber>1</SequenceNumber>
        <TransactionType>01</TransactionType>
        <ActionCode>01</ActionCode>
        <TransactionID>17500515552017001</TransactionID>
        <SellerCode>2200919TRY</SellerCode>
        <BuyerCode>KOCZER</BuyerCode>
        <TransactionReference> </TransactionReference>
        <TransactionDescription1> </TransactionDescription1>
        <TransactionDescription2> </TransactionDescription2>
        <DocumentType>01</DocumentType>
        <DocumentNumber>XXXXXXXXXXX</DocumentNumber>
        <DocumentDate>20170301</DocumentDate>
        <DocumentAmount>10000</DocumentAmount>
        <CurrencyCode>949</CurrencyCode>
        <TransactionAmount>10000</TransactionAmount>
        <TransactionDueDate>20170505</TransactionDueDate>
        <AdditionalInformation1> </AdditionalInformation1>
        <AdditionalInformation2> </AdditionalInformation2>
        <HashCode>XXXXXXXX</HashCode>
    </TXNDETAIL>
    <TXNDETAIL>
        <RecordID>02</RecordID>
        <SequenceNumber>2</SequenceNumber>
        <TransactionType>01</TransactionType>
        <ActionCode>01</ActionCode>
        <TransactionID>17500515622017001</TransactionID>
        <SellerCode>2200919TRY</SellerCode>
        <BuyerCode>KOCZER</BuyerCode>
        <TransactionReference> </TransactionReference>
        <TransactionDescription1> </TransactionDescription1>
        <TransactionDescription2> </TransactionDescription2>
        <DocumentType>01</DocumentType>
        <DocumentNumber>XXXXXXXXXXX</DocumentNumber>
        <DocumentDate>20170301</DocumentDate>
        <DocumentAmount>10000</DocumentAmount>
        <CurrencyCode>949</CurrencyCode>
        <TransactionAmount>10000</TransactionAmount>
        <TransactionDueDate>20170505</TransactionDueDate>
        <AdditionalInformation1> </AdditionalInformation1>
        <AdditionalInformation2> </AdditionalInformation2>
        <HashCode>XXXXXXXX</HashCode>
    </TXNDETAIL>
    <TXNDETAIL>
        <RecordID>02</RecordID>
        <SequenceNumber>3</SequenceNumber>
        <TransactionType>01</TransactionType>
        <ActionCode>01</ActionCode>
        <TransactionID>17500515972017001</TransactionID>
        <SellerCode>2200919TRY</SellerCode>
        <BuyerCode>KOCZER</BuyerCode>
        <TransactionReference> </TransactionReference>
        <TransactionDescription1> </TransactionDescription1>
        <TransactionDescription2> </TransactionDescription2>
        <DocumentType>01</DocumentType>
        <DocumentNumber>XXXXXXXXXXX</DocumentNumber>
        <DocumentDate>20170301</DocumentDate>
        <DocumentAmount>10000</DocumentAmount>
        <CurrencyCode>949</CurrencyCode>
        <TransactionAmount>10000</TransactionAmount>
        <TransactionDueDate>20170505</TransactionDueDate>
        <AdditionalInformation1> </AdditionalInformation1>
        <AdditionalInformation2> </AdditionalInformation2>
        <HashCode>XXXXXXXX</HashCode>
    </TXNDETAIL>
</Transaction>

The output looks like this:

    "RecordID";"SequenceNumber";"TransactionType";"ActionCode";"TransactionID";"SellerCode";"BuyerCode";"TransactionReference";"TransactionDescription1";"TransactionDescription2";"DocumentType";"DocumentNumber";"DocumentDate";"DocumentAmount";"CurrencyCode";"TransactionAmount";"TransactionDueDate";"AdditionalInformation1";"AdditionalInformation2";"HashCode"
"02";"1";"01";"01";"17500515552017001";"2200919TRY";"KOCZER";"";"";"";"01";"XXXXXXXXXXX";"20170301";"10000";"949";"10000";"20170505";"";"";"XXXXXXXX"
"02";"2";"01";"01";"17500515622017001";"2200919TRY";"KOCZER";"";"";"";"01";"XXXXXXXXXXX";"20170301";"10000";"949";"10000";"20170505";"";"";"XXXXXXXX"
"02";"3";"01";"01";"17500515972017001";"2200919TRY";"KOCZER";"";"";"";"01";"XXXXXXXXXXX";"20170301";"10000";"949";"10000";"20170505";"";"";"XXXXXXXX"

Which is great.

However, the real input file also contains "header line" information, in the TXNHEAD tag:

<?xml version="1.0" encoding="UTF-8"?>
<Transaction>
    <TXNHEAD>
        <RecordID>01</RecordID>
        <FileName>001</FileName>
        <IntermediaryCode>19000033</IntermediaryCode>
        <ActualizationDate>20170314</ActualizationDate>
        <SequenceNumber>001</SequenceNumber>
        <NumberofRecords>3</NumberofRecords>
        <AmountofRecords>30000</AmountofRecords>
    </TXNHEAD>
    <TXNDETAIL>
        <RecordID>02</RecordID>
        <SequenceNumber>1</SequenceNumber>
        <TransactionType>01</TransactionType>
        <ActionCode>01</ActionCode>
        <TransactionID>17500515552017001</TransactionID>
        <SellerCode>2200919TRY</SellerCode>
        <BuyerCode>KOCZER</BuyerCode>
        <TransactionReference> </TransactionReference>
        <TransactionDescription1> </TransactionDescription1>
        <TransactionDescription2> </TransactionDescription2>
        <DocumentType>01</DocumentType>
        <DocumentNumber>XXXXXXXXXXX</DocumentNumber>
        <DocumentDate>20170301</DocumentDate>
        <DocumentAmount>10000</DocumentAmount>
        <CurrencyCode>949</CurrencyCode>
        <TransactionAmount>10000</TransactionAmount>
        <TransactionDueDate>20170505</TransactionDueDate>
        <AdditionalInformation1> </AdditionalInformation1>
        <AdditionalInformation2> </AdditionalInformation2>
        <HashCode>XXXXXXXX</HashCode>
    </TXNDETAIL>
    <TXNDETAIL>
        <RecordID>02</RecordID>
        <SequenceNumber>2</SequenceNumber>
        <TransactionType>01</TransactionType>
        <ActionCode>01</ActionCode>
        <TransactionID>17500515622017001</TransactionID>
        <SellerCode>2200919TRY</SellerCode>
        <BuyerCode>KOCZER</BuyerCode>
        <TransactionReference> </TransactionReference>
        <TransactionDescription1> </TransactionDescription1>
        <TransactionDescription2> </TransactionDescription2>
        <DocumentType>01</DocumentType>
        <DocumentNumber>XXXXXXXXXXX</DocumentNumber>
        <DocumentDate>20170301</DocumentDate>
        <DocumentAmount>10000</DocumentAmount>
        <CurrencyCode>949</CurrencyCode>
        <TransactionAmount>10000</TransactionAmount>
        <TransactionDueDate>20170505</TransactionDueDate>
        <AdditionalInformation1> </AdditionalInformation1>
        <AdditionalInformation2> </AdditionalInformation2>
        <HashCode>XXXXXXXX</HashCode>
    </TXNDETAIL>
    <TXNDETAIL>
        <RecordID>02</RecordID>
        <SequenceNumber>3</SequenceNumber>
        <TransactionType>01</TransactionType>
        <ActionCode>01</ActionCode>
        <TransactionID>17500515972017001</TransactionID>
        <SellerCode>2200919TRY</SellerCode>
        <BuyerCode>KOCZER</BuyerCode>
        <TransactionReference> </TransactionReference>
        <TransactionDescription1> </TransactionDescription1>
        <TransactionDescription2> </TransactionDescription2>
        <DocumentType>01</DocumentType>
        <DocumentNumber>XXXXXXXXXXX</DocumentNumber>
        <DocumentDate>20170301</DocumentDate>
        <DocumentAmount>10000</DocumentAmount>
        <CurrencyCode>949</CurrencyCode>
        <TransactionAmount>10000</TransactionAmount>
        <TransactionDueDate>20170505</TransactionDueDate>
        <AdditionalInformation1> </AdditionalInformation1>
        <AdditionalInformation2> </AdditionalInformation2>
        <HashCode>XXXXXXXX</HashCode>
    </TXNDETAIL>
</Transaction>

When applying the same code, I get:

    "RecordID";"FileName";"IntermediaryCode";"ActualizationDate";"SequenceNumber";"NumberofRecords";"AmountofRecords"
"01";"001";"19000033";"20170314";"001";"3";"30000"
"02";;;;"1";;
"02";;;;"2";;
"02";;;;"3";;

When I try this code instead, to retrieve just the head:

#read from file
[xml]$inputFile = Get-Content "c:\pstest\test.xml"
#export xml as csv
$inputFile.Transaction.TXNHEAD.ChildNodes | Export-Csv "c:\pstest\test.csv" -NoTypeInformation -Delimiter:";" -Encoding:UTF8

I get:

"#text"
"01"
"001"
"19000033"
"20170314"
"001"
"3"
"30000"

What I am trying to achieve is this output:

"RecordID";"FileName";"IntermediaryCode";"ActualizationDate";"SequenceNumber";"NumberofRecords";"AmountofRecords"
"01";"001";"19000033";"20170314";"001";"3";"30000"
"RecordID";"SequenceNumber";"TransactionType";"ActionCode";"TransactionID";"SellerCode";"BuyerCode";"TransactionReference";"TransactionDescription1";"TransactionDescription2";"DocumentType";"DocumentNumber";"DocumentDate";"DocumentAmount";"CurrencyCode";"TransactionAmount";"TransactionDueDate";"AdditionalInformation1";"AdditionalInformation2";"HashCode"
"02";"1";"01";"01";"17500515552017001";"2200919TRY";"KOCZER";"";"";"";"01";"XXXXXXXXXXX";"20170301";"10000";"949";"10000";"20170505";"";"";"XXXXXXXX"
"02";"2";"01";"01";"17500515622017001";"2200919TRY";"KOCZER";"";"";"";"01";"XXXXXXXXXXX";"20170301";"10000";"949";"10000";"20170505";"";"";"XXXXXXXX"
"02";"3";"01";"01";"17500515972017001";"2200919TRY";"KOCZER";"";"";"";"01";"XXXXXXXXXXX";"20170301";"10000";"949";"10000";"20170505";"";"";"XXXXXXXX"

What am I doing wrong?

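The first object in a pipeline (or an explicit Select-Object) defines the header, that is, the set of columns, for the whole output, whether it goes to a file or to the console. Properties that exist only on later objects are dropped, which is why the TXNDETAIL rows come out mostly empty.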
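
To see this effect in isolation, here is a minimal sketch with two made-up objects (`$head` and `$detail` are illustrative names, not taken from the question's file). Only the properties of the first object become columns.

```powershell
# Two objects with different property sets, standing in for TXNHEAD and TXNDETAIL.
$head   = [pscustomobject]@{ RecordID = '01'; FileName = '001' }
$detail = [pscustomobject]@{ RecordID = '02'; TransactionID = '17500515552017001' }

# ConvertTo-Csv takes its column list from the first object it receives.
$lines = $head, $detail | ConvertTo-Csv -NoTypeInformation -Delimiter ';'

$lines[0]   # header row: "RecordID";"FileName" (there is no TransactionID column)
```

Swapping the order (`$detail, $head`) would instead produce a header built from `$detail`.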

What you could do is convert them to CSV in two passes and write both results to the same file. For example:

$inputFile.Transaction.TXNHEAD | ConvertTo-Csv -NoTypeInformation -Delimiter ";" | Set-Content -Path "c:\pstest\test.csv" -Encoding UTF8
$inputFile.Transaction.TXNDETAIL | ConvertTo-Csv -NoTypeInformation -Delimiter ";" | Add-Content -Path "c:\pstest\test.csv" -Encoding UTF8

You can also combine them like this:

$inputFile.Transaction.TXNHEAD, $inputFile.Transaction.TXNDETAIL |
ForEach-Object { $_ | ConvertTo-Csv -NoTypeInformation -Delimiter ";" } |
Set-Content -Path "c:\pstest\test.csv" -Encoding UTF8
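
If the record types are not known in advance, the same idea can be generalized: collect the distinct child element names in document order and emit one CSV block per name. This is only a sketch, assuming that all elements with the same name share the same child elements. The inline XML and the variable names `$xml`, `$elements`, `$names` and `$csv` are illustrative.

```powershell
# Small inline sample so the snippet runs on its own (structure mirrors the question's file).
[xml]$xml = @'
<Transaction>
    <TXNHEAD><RecordID>01</RecordID><FileName>001</FileName></TXNHEAD>
    <TXNDETAIL><RecordID>02</RecordID><SequenceNumber>1</SequenceNumber></TXNDETAIL>
    <TXNDETAIL><RecordID>02</RecordID><SequenceNumber>2</SequenceNumber></TXNDETAIL>
</Transaction>
'@

# Keep element nodes only (this skips comments and similar nodes).
$elements = @($xml.Transaction.ChildNodes | Where-Object { $_.NodeType -eq 'Element' })

# Distinct element names, in order of first appearance.
$names = $elements | ForEach-Object { $_.LocalName } | Select-Object -Unique

# One "header + rows" block per element name.
$csv = foreach ($name in $names) {
    $elements | Where-Object { $_.LocalName -eq $name } |
        ConvertTo-Csv -NoTypeInformation -Delimiter ';'
}

$csv   # an array of CSV lines; pipe it to Set-Content to write a file
```

Each block gets its own header line, so the result is not a single rectangular CSV table. That matches the output requested in the question, but a strict CSV parser may not accept it.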

Here's my code for this:

# ============================================================================= 
#  
# NAME:xml2csv.ps1
#  
# AUTHOR: 
# THANKS TO: Rick Sheeley (original snippet on Stack Overflow)
# DATE  :
#
# COMMENT:
# Send large XML with multiple children and attributes to CSV for analysis
#
# Note: For versions 3.0 or newer only
# ============================================================================= 

function Get-Attributes([Object]$pnode)
{

    if($pnode.HasAttributes) {

        foreach($attr in $pnode.Attributes) {

            $xattString+= $attr.Name + ":" + $attr."#text" + ","

        }

    }

    else {

            # The node object has no "nNode" member; use its element name instead.
            $xattString = $pnode.LocalName + ": No Attributes,"

    }

    return $xattString
}

function Get-XmlNode([xml]$XmlDocument, [string]$NodePath, [string]$NamespaceURI = "", [string]$NodeSeparatorCharacter = '.')
{
    # If a Namespace URI was not given, use the Xml document's default namespace.
    if ([string]::IsNullOrEmpty($NamespaceURI)) { $NamespaceURI = $XmlDocument.DocumentElement.NamespaceURI }   

    # In order for SelectSingleNode() to actually work, we need to use the fully qualified node path along with an Xml Namespace Manager, so set them up.
    $xmlNsManager = New-Object System.Xml.XmlNamespaceManager($XmlDocument.NameTable)
    $xmlNsManager.AddNamespace("ns", $NamespaceURI)
    $fullyQualifiedNodePath = "/ns:$($NodePath.Replace($($NodeSeparatorCharacter), '/ns:'))"

    # Try and get the node, then return it. Returns $null if the node was not found.
    $node = $XmlDocument.SelectSingleNode($fullyQualifiedNodePath, $xmlNsManager)
    return $node
}

cls
$fin = "<Filepath>\<myFile>.xml"
$fout = "<Filepath>\<myFile>.csv"

Remove-Item $fout -ErrorAction SilentlyContinue

[xml]$xmlContent = get-content $fin
$row=0
$COMMA=","
$pNode = "ROOT"

# Replace all "MyTopNode" with your top node...

$nNode = "MyTopNode"

$xmlArray  = @(
    [pscustomobject]@{Row= $row;Parent=$pNode;Node=$nNode;Attribute='';ItemType='Root';Value=''})

$xmlArray[$row].Row = $row
$xmlArray[$row].Parent = $pNode
$xmlArray[$row].Node = $nNode
$xmlArray[$row].Attribute = ""
$xmlArray[$row].ItemType = "Root"
$xmlArray[$row].Value = ''   # $attr is not defined yet at this point
$row++

if($xmlContent.MyTopNode.HasAttributes) {

    foreach($attr in $xmlContent.MyTopNode.Attributes) {

        $xmlArray += @(
            [pscustomobject]@{Row= $row;Parent=$pNode;Node=$nNode;Attribute='';ItemType='Root';Value=''})

        $xmlArray[$row].Row = $row
        $xmlArray[$row].Parent = $pNode
        $xmlArray[$row].Node = $nNode
        $xmlArray[$row].Attribute = $attr.LocalName
        $xmlArray[$row].ItemType = "Attribute"
        $xmlArray[$row].Value = $attr."#text"
        $row++

    }

}

# Begin TRY

try {

    foreach($node in $xmlContent.MyTopNode.ChildNodes) {

        $pNode = "MyTopNode"

        $nNode = $node.LocalName

        $xmlArray += @(
            [pscustomobject]@{Row= $row;Parent=$pNode;Node=$nNode;Attribute='';ItemType='Root';Value=''})

        $xmlArray[$row].Row = $row
        $xmlArray[$row].Parent = $pNode
        $xmlArray[$row].Node = $nNode
        $xmlArray[$row].Attribute = ""
        $xmlArray[$row].ItemType = "Root"
        $xmlArray[$row].Value = ''   # no attribute in scope here; $attr would be stale
        $row++

        # $nNode is only the name (a string); test the node object itself.
        if($node.HasAttributes) {

            foreach($attr in $node.Attributes) {

                $xmlArray += @(
                    [pscustomobject]@{Row= $row;Parent=$pNode;Node=$nNode;Attribute='';ItemType='Root';Value=''})

                $xmlArray[$row].Row = $row
                $xmlArray[$row].Parent = $pNode
                $xmlArray[$row].Node = $nNode
                $xmlArray[$row].Attribute = $attr.LocalName
                $xmlArray[$row].ItemType = "Attribute"
                $xmlArray[$row].Value = $attr."#text"
                $row++

            }

        }

        foreach($sNode in $node.ChildNodes) {

            $pNode = $nNode
            $snNode = $sNode.LocalName

            $xmlArray += @(
                [pscustomobject]@{Row= $row;Parent=$pNode;Node=$nNode;Attribute='';ItemType='Root';Value=''})

            $xmlArray[$row].Row = $row
            $xmlArray[$row].Parent = $pNode
            $xmlArray[$row].Node = $snNode
            $xmlArray[$row].Attribute = ""
            $xmlArray[$row].ItemType = "Root"
            $xmlArray[$row].Value = $sNode.InnerText   # element text; $attr would be stale here
            $row++

            if($sNode.HasAttributes) {

                foreach($attr in $sNode.Attributes) {

                    $xmlArray += @(
                        [pscustomobject]@{Row= $row;Parent=$pNode;Node=$nNode;Attribute='';ItemType='Root';Value=''})

                    $xmlArray[$row].Row = $row
                    $xmlArray[$row].Parent = $pNode
                    $xmlArray[$row].Node = $snNode
                    $xmlArray[$row].Attribute = $attr.LocalName
                    $xmlArray[$row].ItemType = "Attribute"
                    $xmlArray[$row].Value = $attr."#text"
                    $row++

                }

            }
        }
    }

    $xmlArray | SELECT Row,Parent,Node,Attribute,ItemType,Value | Export-CSV $fout -NoTypeInformation
}

# End TRY

# Begin Catch

Catch [System.Runtime.InteropServices.COMException]
{
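    # Format-ErrMsg is not defined in this script; format the exception with built-in cmdlets.
    $ErrException = $_.Exception | Format-List * -Force | Out-String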
    $ErrorMessage = $_.Exception.Message
    $ErrorID = $_.FullyQualifiedErrorId
    $line = $_.InvocationInfo.ScriptLineNumber

    Write-Host                           "  "
    Write-Host                           "  "
    Write-Host -ForegroundColor DarkMagenta  ""
    Write-Host -ForegroundColor Magenta      "==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!=="
    Write-Host -ForegroundColor DarkMagenta  ""
    Write-Host -ForegroundColor DarkCyan     "Details:"  
    Write-Host -ForegroundColor White        "---------------------------------------------------------------------------------------------------------------------------------------------------- "
    Write-Host -ForegroundColor Cyan         "`t  Module:        $modname"
    Write-Host -ForegroundColor Cyan         "`t Section:        $sFunc"
    Write-Host -ForegroundColor Cyan         "`t On Line:        $line"
    Write-Host -ForegroundColor Cyan         "`t File:           $fSearchFile | Search will be skipped!!"
    Write-Host -ForegroundColor White        "---------------------------------------------------------------------------------------------------------------------------------------------------- "
    Write-Host -ForegroundColor DarkCyan      "Exception Message:"  
    Write-Host -ForegroundColor White       "---------------------------------------------------------------------------------------------------------------------------------------------------- "
    Write-Host -ForegroundColor Yellow       "`t ShortMessage:   $ErrorMessage"       
    Write-Host -ForegroundColor Yellow       "`t ErrorID:        $ErrorID"
    Write-Host -ForegroundColor White        "---------------------------------------------------------------------------------------------------------------------------------------------------- "
    Write-Host -ForegroundColor Magenta      "$ErrException"  
    Write-Host -ForegroundColor White        "---------------------------------------------------------------------------------------------------------------------------------------------------- "
    Write-Host -ForegroundColor DarkMagenta  "========================================================================================================================================== "
    Write-Host -ForegroundColor Yellow       "This File will be added to the Skip list. Please restart script $sScriptName ...." 
    Write-Host -ForegroundColor DarkMagenta  "========================================================================================================================================== "
    Write-Host                           "  "
    Write-Host                           "  "
}

Catch
{
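    # Format-ErrMsg is not defined in this script; format the exception with built-in cmdlets.
    $ErrException = $_.Exception | Format-List * -Force | Out-String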
    $ErrorMessage = $_.Exception.Message
    $ErrorID = $_.FullyQualifiedErrorId
    $line = $_.InvocationInfo.ScriptLineNumber

    Write-Host                           "  "
    Write-Host                           "  "
    Write-Host -ForegroundColor DarkRed  "================================================================================================================================================="
    Write-Host -ForegroundColor Red      "==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!==!!Error!!=="
    Write-Host -ForegroundColor DarkRed  "================================================================================================================================================="
    Write-Host -ForegroundColor DarkCyan "Details:"  
    Write-Host -ForegroundColor White    "---------------------------------------------------------------------------------------------------------------------------------------------------- "
    Write-Host -ForegroundColor Cyan     "`t  Module:        $modname"
    Write-Host -ForegroundColor Cyan     "`t Section:        $sFunc"
    Write-Host -ForegroundColor Cyan     "`t On Line:        $line"
    Write-Host -ForegroundColor Cyan     "`t File:           $fSearchFile"
    Write-Host -ForegroundColor White    "---------------------------------------------------------------------------------------------------------------------------------------------------- "
    Write-Host -ForegroundColor DarkCyan "Exception Message:"  
    Write-Host -ForegroundColor White    "---------------------------------------------------------------------------------------------------------------------------------------------------- "
    Write-Host -ForegroundColor Yellow      "`t ShortMessage:   $ErrorMessage"       
    Write-Host -ForegroundColor Magenta  "`t ErrorID:        $ErrorID"
    Write-Host -ForegroundColor White    "---------------------------------------------------------------------------------------------------------------------------------------------------- "
    Write-Host -ForegroundColor Red      "$ErrException"  
    Write-Host -ForegroundColor White    "---------------------------------------------------------------------------------------------------------------------------------------------------- "

# Show extended error info if in debug mode.

    if ($extDebug -eq $true) {

        Write-Host -ForegroundColor White    "---------------------------------------------------------------------------------------------------------------------------------------------------- "
        Write-Host -ForegroundColor Red      "Extended Debugging info"  

        Write-Host -ForegroundColor White    "---------------------------------------------------------------------------------------------------------------------------------------------------- "
        $error[0].Exception | Format-List * -Force
        Write-Host -ForegroundColor White    "---------------------------------------------------------------------------------------------------------------------------------------------------- "

    }

    Write-Host -ForegroundColor DarkRed  "==================================================================================================================================================== "
    Write-Host -ForegroundColor DarkRed  "==================================================================================================================================================== "
    Write-Host                           "  "
    Write-Host                           "  "

    Break

# End Catch

}

# Begin Finally

Finally
{
    "finis."
    Exit 0

# End Finally

}
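
The script above handles exactly two levels below the top node. As a rough alternative sketch, not the author's method, a small recursive function can produce the same kind of rows for any depth. `ConvertTo-FlatRow` and the sample XML are hypothetical and introduced here only for illustration. The `Row` counter is omitted and element rows are labeled `Element` rather than `Root`.

```powershell
# Hypothetical helper: emits one row per element and one row per attribute, at any depth.
function ConvertTo-FlatRow {
    param(
        [System.Xml.XmlElement]$Element,
        [string]$Parent = 'ROOT'
    )

    # An element counts as a leaf when it has no child elements (text nodes do not count).
    $isLeaf = -not ($Element.ChildNodes | Where-Object { $_.NodeType -eq 'Element' })

    # Row for the element itself; only leaf elements carry a text value.
    [pscustomobject]@{
        Parent    = $Parent
        Node      = $Element.LocalName
        Attribute = ''
        ItemType  = 'Element'
        Value     = if ($isLeaf) { $Element.InnerText } else { '' }
    }

    # One row per attribute of this element.
    foreach ($attr in $Element.Attributes) {
        [pscustomobject]@{
            Parent    = $Parent
            Node      = $Element.LocalName
            Attribute = $attr.LocalName
            ItemType  = 'Attribute'
            Value     = $attr.Value
        }
    }

    # Recurse into child elements, passing this element's name as their parent.
    foreach ($child in $Element.ChildNodes) {
        if ($child.NodeType -eq 'Element') {
            ConvertTo-FlatRow -Element $child -Parent $Element.LocalName
        }
    }
}

# Illustrative sample input.
[xml]$xml = '<Transaction id="7"><TXNHEAD><RecordID>01</RecordID></TXNHEAD></Transaction>'

$rows = @(ConvertTo-FlatRow -Element $xml.DocumentElement)
$rows | ConvertTo-Csv -NoTypeInformation
```

Because every row has the same five properties, the result can be piped straight to Export-Csv without running into the first-object header problem.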

User2573264's comment only yields a limited amount of data for XML with complex elements and attributes (5 columns of data, whereas my XML should have 27 columns).

I guess it's back to the drawing board for me.
