
When converting a UTF-8 XML file to Unicode in PowerShell, the encoding attribute value shows bigEndianUnicode in the output XML; I want UTF-16 there

I get this line in the output file after converting from UTF-8 to Unicode:

<?xml version="1.0" encoding="bigEndianUnicode"?>

But I need the following line in the XML:

<?xml version="1.0" encoding="UTF-16"?>

Assuming you're working with the [xml] type, you can set the encoding of an XML file as follows:

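# Parse the XML text into an XmlDocument
[xml] $xmlData = '<example>XML</example>'

# Use a full path: .NET resolves relative paths against the process's
# working directory, which may differ from PowerShell's current location
$fileName = 'C:\test.xml'

$settings = New-Object System.Xml.XmlWriterSettings
# Set encoding to UTF-16 (little-endian); the XML declaration written
# by the writer reflects this encoding
$settings.Encoding = [System.Text.Encoding]::Unicode

$xmlWriter = [System.Xml.XmlWriter]::Create($fileName, $settings)
try {
    $xmlData.Save($xmlWriter)
}
finally {
    # Always close the writer so the file is flushed and released
    $xmlWriter.Close()
}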
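
If the XML already lives in a UTF-8 file, the same approach can transcode it and update the declaration in one step. The following is a minimal sketch, not part of the original answer: the paths $inPath and $outPath are hypothetical, and the sample input file is created only to keep the example self-contained.

```powershell
# Hypothetical paths, chosen only for illustration
$inPath  = Join-Path ([System.IO.Path]::GetTempPath()) 'input-utf8.xml'
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'output-utf16.xml'

# Create a sample UTF-8 (no BOM) input file so the sketch is self-contained
$sample = '<?xml version="1.0" encoding="UTF-8"?><example>XML</example>'
[System.IO.File]::WriteAllText($inPath, $sample, (New-Object System.Text.UTF8Encoding $false))

# Load through the XML API rather than as plain text
$doc = New-Object System.Xml.XmlDocument
$doc.Load($inPath)

$settings = New-Object System.Xml.XmlWriterSettings
$settings.Encoding = [System.Text.Encoding]::Unicode   # UTF-16, little-endian

# Save through an XmlWriter so the declaration follows the writer's encoding
$writer = [System.Xml.XmlWriter]::Create($outPath, $settings)
try { $doc.Save($writer) } finally { $writer.Close() }

# Read the result back (the BOM lets ReadAllText detect the encoding) and display it
$text = [System.IO.File]::ReadAllText($outPath)
$text
```

Because the document is saved through the writer, the declaration should name the writer's encoding instead of the original UTF-8. XML encoding names are case-insensitive, so a lowercase utf-16 in the output is equivalent to UTF-16.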

Giorgi Chakhidze's helpful answer shows a proper, XML API-based way to produce an XML file with a given encoding that is also reflected in the output file's XML declaration.

However, it sounds like you've used plain-text processing to transcode files from UTF-8 to "Unicode" (UTF-16LE), and must now adapt these files' XML declarations to match the new encoding.

The following shows a solution for a single file, file.xml (it assumes that file.xml has a "Unicode" (UTF-16LE) BOM, so that Get-Content interprets its encoding correctly):

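# Replace only the encoding value in the XML declaration, then write the
# file back as UTF-16LE (without -Encoding, Set-Content uses its default encoding)
(Get-Content -Raw -LiteralPath file.xml) -replace '(?<=^.+ encoding=")[^"]+', 'utf-16' |
  Set-Content -NoNewline -Encoding Unicode -LiteralPath file.xml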
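
To apply the same replacement to several files, the command can be wrapped in a loop. This is a rough sketch under the same BOM assumption; the function name Set-XmlDeclarationEncoding and its parameters are hypothetical and not part of the answer above.

```powershell
# Hypothetical helper; the name and parameters are illustrative only
function Set-XmlDeclarationEncoding {
    param(
        [Parameter(Mandatory)] [string] $Folder,
        [string] $EncodingName = 'utf-16'
    )
    Get-ChildItem -LiteralPath $Folder -Filter *.xml -File | ForEach-Object {
        # -Raw reads the whole file as one string; the BOM tells Get-Content the encoding
        $text = Get-Content -Raw -LiteralPath $_.FullName
        # Replace only the encoding value in the declaration at the very start of the file
        $text = $text -replace '(?<=^.+ encoding=")[^"]+', $EncodingName
        # Write back as UTF-16LE so the bytes still match the declaration
        Set-Content -NoNewline -Encoding Unicode -LiteralPath $_.FullName -Value $text
    }
}

# Usage (hypothetical folder):
# Set-XmlDeclarationEncoding -Folder 'C:\transcoded'
```

The files are rewritten in place, so it may be worth trying this on a copy of the folder first.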

However, it's unclear how your transcoded-from-UTF-8 files ever ended up with encoding="bigEndianUnicode" in their XML declaration.
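
To check whether a file really starts with a UTF-16LE BOM, and to help diagnose a mismatch between the declaration and the actual bytes, the leading bytes can be inspected directly. A minimal sketch; Get-FileBom is a hypothetical helper name, and UTF-32 BOMs are deliberately not handled here.

```powershell
# Hypothetical helper that reports which BOM, if any, a file starts with
function Get-FileBom {
    param([Parameter(Mandatory)] [string] $Path)

    # Read the raw bytes; pass a full path, since .NET resolves relative
    # paths against the process's working directory
    $bytes = [System.IO.File]::ReadAllBytes($Path)

    if ($bytes.Length -ge 3 -and $bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF) {
        return 'UTF-8'
    }
    # Note: a UTF-32LE BOM (FF FE 00 00) would also land here; this sketch ignores UTF-32
    if ($bytes.Length -ge 2 -and $bytes[0] -eq 0xFF -and $bytes[1] -eq 0xFE) {
        return 'UTF-16LE'
    }
    if ($bytes.Length -ge 2 -and $bytes[0] -eq 0xFE -and $bytes[1] -eq 0xFF) {
        return 'UTF-16BE'
    }
    return 'none'
}

# Usage (hypothetical path):
# Get-FileBom -Path 'C:\test.xml'
```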
