I'm reading the following XML text from a file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<SampleXML>...</SampleXML>
and I load the UTF-8 text using IXMLDOMDocument::loadXML. Then I manipulate the XML and call IXMLDOMDocument::Getxml() to get a _bstr_t of the modified XML. This _bstr_t looks as follows:
<?xml version="1.0" standalone="yes"?>
<ModifiedSampleXML>...</ModifiedSampleXML>
The encoding="UTF-8" attribute in the header is gone. However, if I call IXMLDOMDocument::save(FileName) to save the XML to a file, when I open the file I see that the encoding="UTF-8" attribute is preserved.
Why is the encoding="UTF-8" attribute not there when I call Getxml()? How do I tell MSXML to always preserve this attribute (not only on save)?
The attribute "encoding='UTF-8'" is removed because Getxml() returns the loaded XML as a wide-character (16-bit, UTF-16) string. You can verify this by inspecting the memory held by the returned _bstr_t.
It would be incorrect for MSXML to preserve an attribute that says the encoding is 8-bit when the returned string is in fact 16-bit.
If, however, you load a Unicode XML file that has the "encoding='UTF-16'" attribute, you will see that Getxml() does not remove the attribute.
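If you need the declaration to say UTF-8 without going through save(), one possible workaround is to convert the wide string to UTF-8 bytes yourself and then put the attribute back by hand. The sketch below covers only the second step, on a plain std::string. The helper name ensure_utf8_encoding_decl is hypothetical, not part of MSXML, and it assumes the string already holds UTF-8 bytes and starts directly with the XML declaration.

```cpp
#include <string>

// Hypothetical helper (not an MSXML API): re-inserts encoding="UTF-8" into
// the XML declaration of a string that has ALREADY been converted to UTF-8.
// Assumption: the declaration, if present, is at the very start of the text.
std::string ensure_utf8_encoding_decl(const std::string& xml) {
    const std::string open = "<?xml";

    // Only act on text that begins with an XML declaration ("<?xml" followed
    // by whitespace), so "<?xml-stylesheet ...?>" is left alone.
    if (xml.size() <= open.size() || xml.compare(0, open.size(), open) != 0) {
        return xml;
    }
    const char next = xml[open.size()];
    if (next != ' ' && next != '\t' && next != '\r' && next != '\n') {
        return xml;
    }

    const std::size_t end = xml.find("?>");
    if (end == std::string::npos) {
        return xml;  // malformed declaration: leave untouched
    }

    const std::string decl = xml.substr(0, end);
    if (decl.find("encoding=") != std::string::npos) {
        return xml;  // an encoding is already declared
    }

    // The XML grammar orders the declaration as version, encoding, standalone,
    // so the attribute goes before " standalone" when that is present.
    std::size_t insert_at = decl.find(" standalone");
    if (insert_at == std::string::npos) {
        insert_at = end;
    }

    std::string result = xml;
    result.insert(insert_at, " encoding=\"UTF-8\"");
    return result;
}
```

On Windows, the conversion step that comes before this could be done with WideCharToMultiByte and CP_UTF8 on the BSTR from Getxml(). Only patch the declaration after that conversion, otherwise you recreate the mismatch described above.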