Does SQL Server add a byte order mark when casting to XML?

I have this C# method that is meant to omit the byte order mark (BOM) when serializing to XML:

public static string SerializeAsXml(this object dataToSerialize)
{
   if (dataToSerialize == null) return null;

   using (var stringwriter = new StringWriter())
   {
      var serializer = new XmlSerializer(dataToSerialize.GetType());

      // Serialize the object into an in-memory string.
      serializer.Serialize(stringwriter, dataToSerialize);

      var xml = stringwriter.ToString();

      // Round-trip the string through UTF-8 with the BOM disabled;
      // this is the step intended to drop any byte order mark.
      var utf8 = new UTF8Encoding(false);

      var bytes = utf8.GetBytes(xml);

      xml = utf8.GetString(bytes);

      return xml;
   }
}

The result is passed to a stored procedure, where it is cast to XML by the parameter declaration: @EventMessage AS XML

The stored procedure then adds it as a message on a Service Broker queue.

But when testing, the BOM is still present when the message is retrieved from the queue.

Does SQL Server add a BOM itself when casting? And if so, is there a way to prevent this?

EDIT:

I retrieve the value from the queue with this query in a FitNesse test:

var sqlSelectCommand =
            $@"SELECT message_type_name, message_body, casted_message_body = 
            CASE message_type_name WHEN 'X' 
              THEN CAST(message_body AS NVARCHAR(MAX)) 
              ELSE message_body 
            END 
            FROM {QueueName} WITH (NOLOCK)";

The result is read like this:

var castedMessageBody = reader["casted_message_body"].ToString();

I know the BOM is still present because the test only passes when it includes this:

   if (castedMessageBody.StartsWith(_byteOrderMarkUtf8, StringComparison.Ordinal))
   {
       castedMessageBody = castedMessageBody.Remove(0, _byteOrderMarkUtf8.Length);
   }

Technically, I don't think SQL Server adds a BOM when casting to XML, since:

"The data is stored in an internal representation that preserves the XML content of the data. This internal representation includes information about the containment hierarchy, document order, and element and attribute values. Specifically, the InfoSet content of the XML data is preserved."

Since the BOM is an artefact of string encodings of XML and not part of the XML Infoset, I don't think a BOM is stored.

However, if you cast the XML data to a binary or string representation in SQL Server, it appears to prefer a UTF-16 encoding with a BOM as the representation you receive.
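
If that is what is happening, the BOM would reach the C# side as a leading U+FEFF character once the bytes are decoded into a string. Below is a minimal sketch of that effect and of one way to strip it on the client. It does not talk to SQL Server; it only simulates a UTF-16 payload with a BOM, which is an assumption based on the behaviour described above. The names BomHelper, DecodeUtf16WithBom and StripLeadingBom are hypothetical and do not appear in the original code.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class BomHelper
{
    // A BOM decoded into a .NET string shows up as the single character U+FEFF.
    private const char ByteOrderMark = '\uFEFF';

    // Simulates a UTF-16 LE payload that starts with a BOM (FF FE) and decodes it.
    // Encoding.GetString does not remove the BOM, so the result starts with U+FEFF.
    public static string DecodeUtf16WithBom(string xml)
    {
        byte[] preamble = Encoding.Unicode.GetPreamble();   // FF FE
        byte[] body = Encoding.Unicode.GetBytes(xml);       // GetBytes never emits a preamble
        byte[] payload = new byte[preamble.Length + body.Length];
        Buffer.BlockCopy(preamble, 0, payload, 0, preamble.Length);
        Buffer.BlockCopy(body, 0, payload, preamble.Length, body.Length);
        return Encoding.Unicode.GetString(payload);
    }

    // Returns the text without a single leading U+FEFF, if one is present.
    public static string StripLeadingBom(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;
        return text[0] == ByteOrderMark ? text.Substring(1) : text;
    }
}
```

Calling BomHelper.StripLeadingBom(reader["casted_message_body"].ToString()) would replace the StartsWith/Remove check in the test. Comparing the first character directly also avoids culture-sensitive string comparison, which can treat U+FEFF as ignorable.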
