
PowerShell - XML

I have an input XML file that contains the usual HTML entity names for various characters, e.g. Double Quote = &quot; etc.

<Notes>Double Quote &quot; Single Quote &apos; Ampersand &amp;</Notes>

Before

<?xml version="1.0" encoding="UTF-8"?>
<OrganisationUnits>
  <OrganisationUnitsRow num="8">
    <OrganisationId>ACME24/7HOME</OrganisationId>
    <OrganisationName>ACME LTD</OrganisationName>
    <Notes>Double Quote &quot; Single Quote &apos; Ampersand &amp; </Notes>
    <Sector>P</Sector>
    <SectorDesc>Private Private &amp; Voluntary</SectorDesc>
  </OrganisationUnitsRow>
</OrganisationUnits>

After

<?xml version="1.0" encoding="UTF-8"?>
<OrganisationUnits>
  <OrganisationUnitsRow num="8">
    <OrganisationId>ACME24/7HOME</OrganisationId>
    <OrganisationName>ACME LTD</OrganisationName>
    <Notes>Double Quote " Single Quote ' Ampersand &amp;</Notes>
    <Sector>P</Sector>
    <SectorDesc>Private Private &amp; Voluntary</SectorDesc>
  </OrganisationUnitsRow>
</OrganisationUnits>

I am treating the file as XML and it gets processed OK, nothing very fancy.

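# Load the file as an XmlDocument; -Raw reads it as a single string (PowerShell 3.0+).
$xml = [xml](Get-Content -Raw "$path\$File")
foreach ($CMCAddressesRow in $xml.OrganisationUnits.OrganisationUnitsRow) {
    # ... per-row processing omitted ...
}
$xml.Save("$path\$File")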

When the output is saved, all the HTML codes such as &quot; get replaced by the literal " character. How can I retain the original &quot; entities? And, more importantly, why is this happening?

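What you're referring to are called "character entities". The XML parser behind PowerShell's [xml] type (.NET's System.Xml.XmlDocument) decodes them on import, so you can work with the actual characters these entities represent, and on export it encodes only what must be encoded in the XML file. Quotation characters don't need to be encoded in a node value, so they are not encoded on export. An ampersand, by contrast, always has to be written as &amp;, which is why that entity survives the round trip.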
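The round trip described above can be reproduced in memory with a short sketch. The first part only uses the [xml] cast and standard XmlDocument members. The second part is one possible workaround, not a built-in feature: Restore-QuoteEntities is a hypothetical helper name, and it assumes the serialized XML contains no comments, CDATA sections or attribute values with a literal > in them.

```powershell
# Parse a small sample in memory; the [xml] cast builds a System.Xml.XmlDocument.
$xml = [xml]'<Notes>Double Quote &quot; Single Quote &apos; Ampersand &amp;</Notes>'

# On import the parser has already decoded the entities into plain characters.
$text = $xml.DocumentElement.InnerText

# On export only the characters that XML requires to be escaped in element
# content are re-encoded, so the quotes stay literal and & becomes &amp;.
$serialized = $xml.OuterXml

# Hypothetical helper (not part of PowerShell): re-encode quotes in the text
# that sits between tags of an already serialized XML string.
# Assumption: no comments, CDATA sections or attribute values containing '>'.
function Restore-QuoteEntities {
    param([string]$XmlText)

    # Cast the script block explicitly so the MatchEvaluator overload is chosen.
    $evaluator = [System.Text.RegularExpressions.MatchEvaluator]{
        param($m)
        $m.Value.Replace('"', '&quot;').Replace("'", '&apos;')
    }

    # Match runs of text between a closing '>' and the next '<'.
    [regex]::Replace($XmlText, '(?<=>)[^<]+(?=<)', $evaluator)
}

$restored = Restore-QuoteEntities -XmlText $serialized
```

Any conforming XML parser treats " and &quot; in element text as the same character, so a post-processing step like this is only worthwhile when a downstream consumer compares the file as raw text.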
