RestSharp has problems deserializing XML including Byte Order Mark?

Question

There is a public web service that I want to use in a short C# application: http://ws.parlament.ch/

The XML returned by this web service has a BOM (byte order mark) at the beginning, which causes RestSharp to fail when deserializing the XML, with the following error message:

Error retrieving response. Check inner details for more info. ---> System.Xml.XmlException: Data at the root level is invalid. Line 1, position 1.
   at System.Xml.XmlTextReaderImpl.Throw(Exception e)
   at System.Xml.XmlTextReaderImpl.Throw(String res, String arg)
   at System.Xml.XmlTextReaderImpl.ParseRootLevelWhitespace()
   at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
   at System.Xml.XmlTextReaderImpl.Read()
   at System.Xml.Linq.XDocument.Load(XmlReader reader, LoadOptions options)
   at System.Xml.Linq.XDocument.Parse(String text, LoadOptions options)
   at System.Xml.Linq.XDocument.Parse(String text)
   at RestSharp.Deserializers.XmlDeserializer.Deserialize[T](IRestResponse response)
   at RestSharp.RestClient.Deserialize[T](IRestRequest request, IRestResponse raw)
   --- End of inner exception stack trace ---

Here is a simple sample that uses http://ws.parlament.ch/sessions?format=xml to get a list of 'Sessions':

public class Session
{
    public int Id { get; set; }
    public DateTime? Updated { get; set; }
    public int? Code { get; set; }
    public DateTime? From { get; set; }
    public string Name { get; set; }
    public DateTime? To { get; set; }
}


static void Main(string[] args)
{
    var request = new RestRequest();
    request.RequestFormat = DataFormat.Xml;
    request.Resource = "sessions";
    request.AddParameter("format", "xml");

    var client = new RestClient("http://ws.parlament.ch/");
    var response = client.Execute<List<Session>>(request);

    if (response.ErrorException != null)
    {
        const string message = "Error retrieving response.  Check inner details for more info.";
        var ex = new ApplicationException(message, response.ErrorException);
        Console.WriteLine(ex);
    }

    List<Session> test = response.Data;

    Console.Read();
}

When I manipulate the returned XML with Fiddler to remove the first 3 bytes (the BOM), the above code works! Could someone please help me handle this directly in RestSharp? What am I doing wrong? Thank you in advance!

Answer 1

I found the solution. Thank you @arootbeer for the hints!

Instead of wrapping the XmlDeserializer, you can also use the RestRequest.OnBeforeDeserialization callback from RestSharp. You just need to insert something like this after the new RestRequest() (see my initial code example), and then it works perfectly!

request.OnBeforeDeserialization = resp =>
{
    // Remove a leading UTF-8 byte order mark, if present.
    // See: http://stackoverflow.com/questions/19663100/restsharp-has-problems-deserializing-xml-including-byte-order-mark
    string byteOrderMarkUtf8 = Encoding.UTF8.GetString(Encoding.UTF8.GetPreamble());
    // Ordinal comparison: a culture-sensitive StartsWith ignores U+FEFF
    // and can match even when no BOM is present.
    if (resp.Content.StartsWith(byteOrderMarkUtf8, StringComparison.Ordinal))
        resp.Content = resp.Content.Remove(0, byteOrderMarkUtf8.Length);
};
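
A caveat for any string-based BOM check: string.StartsWith(string) without a StringComparison argument is culture-sensitive, and culture-sensitive comparisons treat U+FEFF as an ignorable character, so such a check can succeed even when the content has no BOM. Below is a minimal sketch of a check that compares the first character directly; the BomText class and the StripLeadingBom name are hypothetical and not part of RestSharp.

```csharp
using System;

public static class BomText
{
    // U+FEFF is how a decoded byte order mark appears inside a .NET string.
    private const char ByteOrderMark = '\uFEFF';

    // Returns the content without a leading BOM character; other input is returned unchanged.
    public static string StripLeadingBom(string content)
    {
        if (string.IsNullOrEmpty(content))
        {
            return content;
        }

        // A char comparison is ordinal, so no culture rules are involved.
        return content[0] == ByteOrderMark ? content.Substring(1) : content;
    }

    public static void Main()
    {
        // Hypothetical payloads, with and without a leading BOM.
        Console.WriteLine(StripLeadingBom("\uFEFF<sessions />"));
        Console.WriteLine(StripLeadingBom("<sessions />"));
    }
}
```

Inside the callback, this would presumably be used as resp.Content = BomText.StripLeadingBom(resp.Content);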

Answer 2

I had this same problem, but not specifically with RestSharp. Use this:

var responseXml = new UTF8Encoding(false).GetString(bytes);

Original discussion: XmlReader breaks on UTF-8 BOM

Pertinent quote from the answer:

The xml string must not (!) contain the BOM, the BOM is only allowed in byte data (e.g. streams) which is encoded with UTF-8. This is because the string representation is not encoded, but already a sequence of unicode characters.

Edit: Looking through their docs, it looks like the most straightforward way to handle this (aside from filing a GitHub issue) is to call the non-generic Execute() method and deserialize the response from that string. You could also create an IDeserializer that wraps the default XML deserializer.
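
To illustrate the quoted point that a BOM belongs to byte data rather than to a string, here is a minimal sketch that decodes the raw bytes through a StreamReader, which skips a leading UTF-8 preamble, before handing the text to an XML parser. As far as I know, Encoding.GetString itself keeps the BOM as a leading U+FEFF character regardless of how the UTF8Encoding was constructed, so the decoding step is where it has to be dropped. The BomDemo class, the DecodeWithoutBom helper, and the sample payload are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class BomDemo
{
    // Decodes UTF-8 bytes into a string. StreamReader consumes a leading BOM
    // instead of passing it through as a U+FEFF character.
    public static string DecodeWithoutBom(byte[] bytes)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8, detectEncodingFromByteOrderMarks: true))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Build a hypothetical response body: UTF-8 BOM (EF BB BF) followed by XML.
        byte[] bom = Encoding.UTF8.GetPreamble();
        byte[] xml = Encoding.UTF8.GetBytes("<sessions />");
        byte[] payload = new byte[bom.Length + xml.Length];
        Buffer.BlockCopy(bom, 0, payload, 0, bom.Length);
        Buffer.BlockCopy(xml, 0, payload, bom.Length, xml.Length);

        // The decoded text starts with '<', so XDocument.Parse can read it.
        string text = DecodeWithoutBom(payload);
        XDocument doc = XDocument.Parse(text);
        Console.WriteLine(doc.Root.Name);
    }
}
```

With RestSharp, the bytes would presumably come from the response's RawBytes after a non-generic Execute() call, as the edit above suggests.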

Answer 3

The solution that @dataCore posted doesn't quite work, but this one should.

request.OnBeforeDeserialization = resp => {
    if (resp.RawBytes.Length >= 3 && resp.RawBytes[0] == 0xEF && resp.RawBytes[1] == 0xBB && resp.RawBytes[2] == 0xBF)
    {
        // Copy the data but with the UTF-8 BOM removed.
        var newData = new byte[resp.RawBytes.Length - 3];
        Buffer.BlockCopy(resp.RawBytes, 3, newData, 0, newData.Length);
        resp.RawBytes = newData;

        // Force re-conversion to string on next access
        resp.Content = null;
    }
};

Setting resp.Content to null is there as a safety guard, as RawBytes is only converted to a string if Content isn't already set to a value.

Answer 4

To make it work with RestSharp, you can parse the response content manually and remove all the "funny" characters that come before the '<'.

// Drop any "funny" characters (such as a BOM) that come before the first '<'.
// IndexOf avoids an out-of-range exception when the content is empty or has no '<'.
int start = responseContent.IndexOf('<');
if (start > 0)
{
    responseContent = responseContent.Substring(start);
}

XmlReader.Create(new StringReader(responseContent));
