
XmlWriter Async operations fail with XmlWriterSettings.OutputMethod = Html

When creating an XmlWriter with XmlWriterSettings.OutputMethod = XmlOutputMethod.Html, async operations fail. When creating the same writer with XmlOutputMethod.AutoDetect (the default), async operations succeed.

Failing code (with fiddle):

var transform = new XslCompiledTransform();
using var reader = XmlReader.Create(new StringReader(@"
  <xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
    <xsl:output method=""html"" indent=""yes"" doctype-system=""html""/>
    <xsl:template match=""/"">
      <bar/>
    </xsl:template>
  </xsl:stylesheet>"));
transform.Load(reader);

var settings = transform.OutputSettings.Clone();
settings.CloseOutput = false;
settings.Async = true;

using var stream = new MemoryStream();
using (var writer = XmlWriter.Create(stream, settings))
{
    await writer.WriteStartDocumentAsync();
    await writer.WriteStartElementAsync(null, "foo", null);
    await writer.WriteEndElementAsync();
    await writer.WriteEndDocumentAsync();
}
stream.Position = 0;
var content = new StreamReader(stream).ReadToEnd();
Assert.Contains("foo", content);

with the stack trace:

Message: 
System.NotImplementedException : The method or operation is not implemented.

  Stack Trace: 
XmlWriter.WriteStartElementAsync(String prefix, String localName, String ns)
XmlWellFormedWriter.WriteStartElementAsync_NoAdvanceState(String prefix, String localName, String ns)
XmlWellFormedWriter.WriteStartElementAsync(String prefix, String localName, String ns)
XmlAsyncCheckWriter.WriteStartElementAsync(String prefix, String localName, String ns)

Working code (with working fiddle):

var settings = new XmlWriterSettings();
settings.CloseOutput = false;
settings.Async = true;

using var stream = new MemoryStream();
using (var writer = XmlWriter.Create(stream, settings))
{
    await writer.WriteStartDocumentAsync();
    await writer.WriteStartElementAsync(null, "foo", null);
    await writer.WriteEndElementAsync();
    await writer.WriteEndDocumentAsync();
}
stream.Position = 0;
var content = new StreamReader(stream).ReadToEnd();
Assert.Contains("foo", content);

Inspecting a variety of things in debug mode, both code paths appear to use a System.Xml.XmlAsyncCheckWriter under the hood.

Interestingly, it's not the OutputMethod causing this, but the doctype-system attribute. Remove the attribute, and your async calls will magically work.

I can show you what's happening, but can't tell you WHY they chose to do it this way.


Firstly, the writer is created by XmlWriterSettings.CreateWriter(Stream). Cutting out all the fluff, it goes like this:

internal XmlWriter CreateWriter(Stream output)
{
    XmlWriter writer;
    if (Encoding.WebName == "utf-8") {
        switch (OutputMethod) {
            case XmlOutputMethod.Html:
                writer = new HtmlUtf8RawTextWriter(output, this);
                break;
        }
    }

    // Wrap with Xslt/XQuery specific writer if needed;
    // XmlOutputMethod.AutoDetect writer does this lazily when it creates the underlying Xml or Html writer.
    if (OutputMethod != XmlOutputMethod.AutoDetect) {
        if (IsQuerySpecific) {
            // Create QueryOutputWriter if CData sections or DocType need to be tracked
            writer = new QueryOutputWriter((XmlRawWriter)writer, this);
        }
    }

    // wrap with well-formed writer
    writer = new XmlWellFormedWriter(writer, this);

    if (_useAsync)
        writer = new XmlAsyncCheckWriter(writer);

    return writer;
}

So in the end, you get onion-like (or ogre-like) layers of writers:

XmlAsyncCheckWriter(
    XmlWellFormedWriter(
        QueryOutputWriter(
            HtmlUtf8RawTextWriter)))

When you make your Write...Async() calls, you'd expect them to cascade from the outermost writer all the way down to the deepest level, HtmlUtf8RawTextWriter, which does have the async methods you want.

UNFORTUNATELY, the QueryOutputWriter wrapper does NOT delegate the async calls to the inner writer. Judging by the stack trace, the call ends up in the base XmlWriter.WriteStartElementAsync, which throws the NotImplementedException. Is it a bug? Or a deliberate choice? I don't know.
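To illustrate the mechanism, here is a minimal sketch of a wrapper chain in which one layer forwards the sync call but not the async one. All class names (WriterBase, RawWriter, ForgetfulWrapper) are hypothetical stand-ins and not the framework's actual types; the only assumption carried over from the stack trace is that the base class's default async method throws.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for a writer base class (illustrative only).
public abstract class WriterBase
{
    public abstract void WriteStart(string name);

    // Default async implementation throws unless a subclass overrides it.
    public virtual Task WriteStartAsync(string name) => throw new NotImplementedException();
}

// Innermost layer: supports both the sync and the async call.
public sealed class RawWriter : WriterBase
{
    public override void WriteStart(string name) => Console.WriteLine($"<{name}>");

    public override Task WriteStartAsync(string name)
    {
        WriteStart(name);
        return Task.CompletedTask;
    }
}

// Middle layer: forwards the sync call but never overrides the async one,
// so async calls fall through to the throwing default in WriterBase.
public sealed class ForgetfulWrapper : WriterBase
{
    private readonly WriterBase _inner;

    public ForgetfulWrapper(WriterBase inner) => _inner = inner;

    public override void WriteStart(string name) => _inner.WriteStart(name);
}

public static class Demo
{
    public static async Task Main()
    {
        // Calling the inner writer directly works.
        WriterBase direct = new RawWriter();
        await direct.WriteStartAsync("foo");

        // The same call through the wrapper never reaches the inner writer.
        WriterBase wrapped = new ForgetfulWrapper(new RawWriter());
        try
        {
            await wrapped.WriteStartAsync("foo");
        }
        catch (NotImplementedException)
        {
            Console.WriteLine("The wrapper did not forward the async call.");
        }
    }
}
```

The inner writer being async-capable makes no difference here: one layer that skips the override is enough to break the whole chain.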

If you don't need a DOCTYPE, and don't use CDATA in your output (both handled by our problematic QueryOutputWriter), simply removing the doctype-system attribute from your XSL will solve your problem. That makes the following IsQuerySpecific property evaluate to false, which prevents the undesirable wrapping.

private bool IsQuerySpecific => 
    CDataSectionElements.Count != 0
    || _docTypePublic != null
    || _docTypeSystem != null
    || _standalone == XmlStandalone.Yes;

...

if (IsQuerySpecific)
    xmlWriter = new QueryOutputWriter((XmlRawWriter)xmlWriter, this);
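If you want to check this on your own runtime, a small probe along the following lines may help. It is only a sketch: DoctypeProbe and its members are made-up names, the stylesheet and call sequence are taken from the question, and the result may differ between .NET versions, so no expected output is given.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using System.Xml;
using System.Xml.Xsl;

public static class DoctypeProbe
{
    // Builds the stylesheet from the question, with or without doctype-system.
    private static string Stylesheet(bool withDoctype) =>
        @"<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
            <xsl:output method=""html"" indent=""yes""" + (withDoctype ? @" doctype-system=""html""" : "") + @"/>
            <xsl:template match=""/""><bar/></xsl:template>
          </xsl:stylesheet>";

    // Returns "ok" if the async writes succeed, or the name of the exception seen.
    public static async Task<string> TryAsyncWrite(bool withDoctype)
    {
        var transform = new XslCompiledTransform();
        using (var reader = XmlReader.Create(new StringReader(Stylesheet(withDoctype))))
        {
            transform.Load(reader);
        }

        var settings = transform.OutputSettings.Clone();
        settings.CloseOutput = false;
        settings.Async = true;

        using var stream = new MemoryStream();
        try
        {
            using (var writer = XmlWriter.Create(stream, settings))
            {
                await writer.WriteStartDocumentAsync();
                await writer.WriteStartElementAsync(null, "foo", null);
                await writer.WriteEndElementAsync();
                await writer.WriteEndDocumentAsync();
            }
            return "ok";
        }
        catch (NotImplementedException)
        {
            return nameof(NotImplementedException);
        }
    }

    public static async Task Main()
    {
        Console.WriteLine($"with doctype-system:    {await TryAsyncWrite(true)}");
        Console.WriteLine($"without doctype-system: {await TryAsyncWrite(false)}");
    }
}
```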

If you do need a DOCTYPE or CDATA sections, then you are in for a fun exercise of reimplementing some of these layers and overriding the async methods.
