
How to read the text element of an XML node without dereferencing entities using XmlReader

I am attempting to read an XML document containing elements like the ones shown below.

Accessing the text node via reader.Value, reader.ReadContentAsString(), or reader.ReadContentAsObject() returns a truncated value: only the text after the last ampersand survives, so for the data below that is ISO^urn:ihe:iti:xds:2013:referral. With XmlDocument the text nodes are read correctly, so I assume there must be a way to make this work with the reader as well.

<Slot name="urn:ihe:iti:xds:2013:referenceIdList">
  <ValueList>
    <Value>123456^^^&amp;orgID&amp;ISO^urn:ihe:iti:xds:2013:referral</Value>
    <Value>098765^^^&amp;orgID&amp;ISO^urn:ihe:iti:xds:2013:referral</Value>
  </ValueList>
</Slot>
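
For comparison, the XmlDocument route mentioned above can be sketched roughly as follows. The class and method names (XmlDocumentSlotReader, ReadValues) are hypothetical, and the sketch assumes the Slot fragment carries no default namespace, as in the sample shown here; a namespaced document would need an XmlNamespaceManager for the XPath query.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class XmlDocumentSlotReader
{
    // Hypothetical helper: returns the text of every <Value> element in the XML.
    public static List<string> ReadValues(string xml)
    {
        var document = new XmlDocument();
        document.LoadXml(xml);

        var values = new List<string>();
        // "//Value" is an XPath expression matching Value elements at any depth
        // (assumption: the elements are not in a namespace).
        foreach (XmlNode node in document.SelectNodes("//Value"))
        {
            // InnerText is the element's text content with &amp; already expanded to &.
            values.Add(node.InnerText);
        }
        return values;
    }
}
```

Calling XmlDocumentSlotReader.ReadValues(xml) on the fragment above yields one string per Value element, each containing the literal ampersands.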


Clarification Edit

After asking the question I was able to determine that my issue came from creating an XmlReader from an XPathNavigator instance, which in turn was created from a MessageBuffer in the context of a WCF service call. Thus @DarkGray's answer was correct for the original question but did not really address the root of the problem. I provided a second answer which addresses my corner case.

System.ServiceModel.Channels.Message message; // the inbound SOAP message
var buffer = message.CreateBufferedCopy(11 * 1024 * 1024);
var navigator = buffer.CreateNavigator();
var reader = navigator.ReadSubtree();
// advance the reader to the text element
//
// `reader.Value` now produces ISO^urn:ihe:iti:xds:2013:referral

Answer: reader.Value

Output:

123456^^^&orgID&ISO^urn:ihe:iti:xds:2013:referral
098765^^^&orgID&ISO^urn:ihe:iti:xds:2013:referral

Example:

public static void Execute()
{
  var xml = @"
    <Slot name='urn:ihe:iti:xds:2013:referenceIdList'>
      <ValueList>
        <Value>123456^^^&amp;orgID&amp;ISO^urn:ihe:iti:xds:2013:referral</Value>
        <Value>098765^^^&amp;orgID&amp;ISO^urn:ihe:iti:xds:2013:referral</Value>
      </ValueList>
    </Slot>
  ";
  // Dispose the reader when done; XmlReader implements IDisposable.
  using (var reader = System.Xml.XmlReader.Create(new System.IO.StringReader(xml)))
  {
    // Read() returns false once the end of the input is reached.
    while (reader.Read())
    {
      // Text nodes carry the element content with entities already expanded.
      if (reader.NodeType == System.Xml.XmlNodeType.Text)
        System.Console.WriteLine(reader.Value);
    }
  }
}
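
If the values are needed as a collection rather than printed, the loop above could be adapted along these lines. This is only a sketch: XmlReaderSlotReader and ReadValues are made-up names, and it assumes the elements of interest are named Value and contain only text.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class XmlReaderSlotReader
{
    // Hypothetical helper: collects the text content of every <Value> element.
    public static List<string> ReadValues(string xml)
    {
        var values = new List<string>();
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Value")
                {
                    // Reads the element's text and leaves the reader on the node
                    // after </Value>, so Read() must not be called again here.
                    values.Add(reader.ReadElementContentAsString());
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return values;
    }
}
```

The loop avoids calling Read() right after ReadElementContentAsString() because the reader has already moved on; an extra Read() would skip a Value element that directly follows the previous one with no whitespace in between.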

My question ended up being too broad, as the incorrect behavior (truncation when using reader.Value) only manifested when the code was executing within the context of a WCF call. It worked perfectly fine when exercising the logic of the containing class from a unit test.
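
One way to check whether the truncation is tied to XPathNavigator in general or only to the navigator handed out by the WCF MessageBuffer is a standalone test along the following lines. It is a sketch under stated assumptions: XPathDocument stands in for the MessageBuffer (which is not what the service uses), and NavigatorReaderCheck and ReadTextNodes are hypothetical names.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class NavigatorReaderCheck
{
    // Hypothetical helper: reads all text nodes through a reader obtained
    // from an XPathNavigator, mirroring navigator.ReadSubtree() in the service.
    public static List<string> ReadTextNodes(string xml)
    {
        // Assumption: XPathDocument is used here instead of a WCF MessageBuffer.
        var document = new XPathDocument(new StringReader(xml));
        XPathNavigator navigator = document.CreateNavigator();

        var texts = new List<string>();
        using (XmlReader reader = navigator.ReadSubtree())
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Text)
                    texts.Add(reader.Value);
            }
        }
        return texts;
    }
}
```

If the strings returned here contain the full values including the ampersands, the truncation is specific to the navigator created from the MessageBuffer rather than to XPathNavigator as such.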

So the salient setup can be reproduced as follows:

The Failing Code

System.ServiceModel.Channels.Message message; // the inbound SOAP message
var buffer = message.CreateBufferedCopy(11 * 1024 * 1024);
var navigator = buffer.CreateNavigator();
var reader = navigator.ReadSubtree();
// advance the reader to the text element
//
// `reader.Value` now produces ISO^urn:ihe:iti:xds:2013:referral

Once this reader instance was created, any text node read from it produced the truncated value whenever the text contained a character entity reference. The only way I found to read the original value with full fidelity was to avoid XPathNavigator completely and instead take the hit of creating another Message instance. Note that the fix takes the long way around to write the SOAP envelope to the stream because the affected service uses MTOM encoding; writing to the stream directly from the MessageBuffer resulted in the MIME fences (boundary markers) being written out as well.

The Fix

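System.ServiceModel.Channels.Message message; // the inbound SOAP message
// MaxMessageSize is assumed to be a constant defined elsewhere
// (the failing code above used 11 * 1024 * 1024).
var buffer = message.CreateBufferedCopy(MaxMessageSize);
// A fresh Message built from the buffer; it needs its own name because
// `message` is already declared above.
var copy = buffer.CreateMessage();
using (MemoryStream stream = new MemoryStream())
using (XmlWriter writer = XmlWriter.Create(stream))
{
    // Write the SOAP envelope as plain XML, then rewind the stream for reading.
    copy.WriteMessage(writer);
    writer.Flush();
    stream.Position = 0;

    using (XmlReader reader = XmlReader.Create(stream))
    {
        // business logic goes here
    }
}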
