
.Net XmlWriter - unexpected encoding is confusing me

Environment is VS2008, .Net 3.5

The following C# code (note the specified encoding of UTF-8):

XmlWriterSettings settings = new XmlWriterSettings ();
StringBuilder sb  = new StringBuilder();
settings.Encoding = System.Text.Encoding.UTF8;
settings.Indent   = false;
settings.NewLineChars = "\n";
settings.ConformanceLevel =  System.Xml.ConformanceLevel.Document;

using (XmlWriter writer = XmlWriter.Create (sb, settings))
{
   // Write XML data.
   writer.WriteStartElement ("CCHEADER");
   writer.WriteAttributeString ("ProtocolVersion", "1.0.0");
   writer.WriteAttributeString ("ServerCapabilities", "0x0000000F");
   writer.WriteEndElement ();
   writer.Flush ();
}

actually generates this XML:

<?xml version="1.0" encoding="utf-16"?>
<CCHEADER ProtocolVersion="1.0.0" ServerCapabilities="0x0000000F" />

Why is the wrong encoding generated here? What am I doing wrong?

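I suspect it's because it's writing to a StringBuilder, which is inherently UTF-16. XmlWriter.Create(StringBuilder, XmlWriterSettings) wraps the builder in a StringWriter, whose Encoding property always reports UTF-16, and the XML declaration takes its encoding name from that writer rather than from settings.Encoding. One way to get round this is to create a class derived from StringWriter that overrides the Encoding property.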

I believe I've got one in MiscUtil - but it's pretty trivial to write anyway. Something like this:

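using System.IO;
using System.Text;

// A StringWriter that reports a caller-chosen encoding instead of UTF-16.
public sealed class StringWriterWithEncoding : StringWriter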
{
    private readonly Encoding encoding;

    public StringWriterWithEncoding (Encoding encoding)
    {
        this.encoding = encoding;
    }

    public override Encoding Encoding
    {
        get { return encoding; }
    }
}
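
As a minimal sketch of how such a writer might be plugged into the code from the question: the class is repeated so the snippet compiles on its own, and the names HeaderDemo and BuildHeader are made up for illustration. settings.Encoding is left unset here because the XmlWriter takes the encoding name from the TextWriter it is given.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Same idea as the class above, repeated so this sketch is self-contained.
public sealed class StringWriterWithEncoding : StringWriter
{
    private readonly Encoding encoding;

    public StringWriterWithEncoding(Encoding encoding)
    {
        this.encoding = encoding;
    }

    public override Encoding Encoding
    {
        get { return encoding; }
    }
}

// Hypothetical helper, not part of the original question or answer.
public static class HeaderDemo
{
    public static string BuildHeader()
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = false;
        settings.NewLineChars = "\n";
        settings.ConformanceLevel = ConformanceLevel.Document;

        // The writer reports UTF-8, so the XML declaration should say utf-8.
        StringWriterWithEncoding sw = new StringWriterWithEncoding(Encoding.UTF8);
        using (XmlWriter writer = XmlWriter.Create(sw, settings))
        {
            writer.WriteStartElement("CCHEADER");
            writer.WriteAttributeString("ProtocolVersion", "1.0.0");
            writer.WriteAttributeString("ServerCapabilities", "0x0000000F");
            writer.WriteEndElement();
        }
        return sw.ToString();
    }
}
```

Note that the returned string is still UTF-16 in memory. The override only changes what the declaration claims, so it is appropriate when the string will later be encoded as UTF-8 bytes, for example when it is written to a file or a socket.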

A .NET string is always held in memory as UTF-16. I expect this is the source of your encoding problem: because you're writing to a StringBuilder, the writer declares utf-16 regardless of settings.Encoding.
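
If what is ultimately needed is UTF-8 bytes rather than a string, one alternative (an assumption about the goal, not something the question states) is to write to a Stream, where settings.Encoding is honoured. A rough sketch, with the hypothetical names Utf8XmlDemo and BuildHeaderBytes:

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper, not part of the original question or answer.
public static class Utf8XmlDemo
{
    public static byte[] BuildHeaderBytes()
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        // UTF8Encoding(false) asks for UTF-8 without a byte order mark.
        settings.Encoding = new UTF8Encoding(false);
        settings.Indent = false;
        settings.ConformanceLevel = ConformanceLevel.Document;

        using (MemoryStream ms = new MemoryStream())
        {
            // With a Stream target, the writer encodes the output itself,
            // so the declaration and the actual bytes agree.
            using (XmlWriter writer = XmlWriter.Create(ms, settings))
            {
                writer.WriteStartElement("CCHEADER");
                writer.WriteAttributeString("ProtocolVersion", "1.0.0");
                writer.WriteAttributeString("ServerCapabilities", "0x0000000F");
                writer.WriteEndElement();
            }
            // The inner using has flushed the writer by this point.
            return ms.ToArray();
        }
    }
}
```

The resulting bytes can be sent or saved as they are. Decoding them with Encoding.UTF8.GetString gives a string for inspection, which is once again UTF-16 in memory.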
