简体   繁体   中英

Convert a literal improperly encoded string (e.g., "ñ") to ISO-8859-1 (Latin1) H

Without going into too much detail, I have a C# WCF application that is a wrapper for an XML based API I am calling. That API returns a string, which is really just an XML document. I then parse that XML, and return it. That returned information is displayed in the browser as JSON.

A bit confusing, but here is some sampled code:

[OperationContract]
[WebInvoke(Method = "GET", BodyStyle = WebMessageBodyStyle.Bare,
    ResponseFormat = WebMessageFormat.Json, UriTemplate = "/TestGetUser")]
TestGetUserResponse TestGetUser();

/* ... */

[DataContract(Namespace = "http://schema.mytestdomain/", Name = "TestGetUser")]
public class TestGetUserResponse
{
    [DataMember]
    public User User { get; set; }
    [DataMember]
    public Error Error { get; set; }
}

And TestGetUser being:

public TestGetUserResponse TestGetUser() {
    WebClient client = getCredentials(); // getCredentials() method is defined elsewhere

    string apiUrl = "http://my.api.url.com/API";
    string apiRequest = "<?xml version='1.0' encoding='utf-8' ?><test>My XML Request Lives Here</test>";
    
    string result = client.UploadString(apiUrl, apiRequest);
    
    XmlDocument user = new XmlDocument();
    user.LoadXml(result);
    
    userNode = user.SelectSingleNode("/my[1]/xpath[1]/user[1]");
    
    return new TestGetUserResponse {
        Error = new Error(),
        User = new User {
            Name = userNode.SelectSingleNode("name[1]").InnerText,
            Email = userNode.SelectSingleNode("email[1]").InnerText,
            ID = System.Convert.ToInt32(userNode.SelectSingleNode("id[1]").InnerText)
        }
    };
}

So, when I hit my URL from a browser, it returns a JSON string, like below:

{
    "Error": {
        "ErrorCode": 0,
        "ErrorDetail": null,
        "ErrorMessage":"Success"
    },
    "User": {
        "Name": "John Smith",
        "Email": "john.smith@example.com",
        "ID": 12345
    }
}

Now, my problem is, sometimes the string that is returned (directly from the API) is a badly encoded UTF-8 string (I think? I could be getting this a bit wrong). For example, I may get back:

{
    "Error": {
        "ErrorCode": 0,
        "ErrorDetail": null,
        "ErrorMessage":"Success"
    },
    "User": {
        "Name": "Jose Nuñez",
        "Email": "jose.nunez@example.com",
        "ID": 54321
    }
}

Notice the ñ in the Name property under the User object.

My question is, how can I convert this improperly encoded string to a ñ , which is what it should be?

I've found a bunch of posts

But none seem to be exactly what I need, or trying to borrow from those posts have failed.

So, to make my question as simple as possible,

If I have a variable in a C# (.NET 3.5) application that when I write it out to the screen get's written as 'ñ', how can I "re-encode" (may be wrong word) so that it outputs as 'ñ'?

Thanks in advance.

Ideally this would be fixed in the api you are calling so it is returning the expected encoding. But you should be able to fix it this way:

byte[] bytes = Encoding.GetEncoding(1252).GetBytes(Name);
var nameFixed = Encoding.UTF8.GetString(bytes);

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM