I'm reading a JSON file in which some fields contain strings like the following: "Eduardo Fonseca Bola\u00c3\u00b1os comparti\u00c3\u00b3 una publicaci\u00c3\u00b3n."
The end result should look like this: "Eduardo Fonseca Bolaños compartió una publicación."
You can use the Json.NET library to decode the string. The deserializer decodes the \u escape sequences automatically; the decoded characters then have to be converted back to bytes and read as UTF-8.
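using System;
using System.Text;

public class Example
{
    public string Name { get; set; }
}
//
var json = @"{ ""Name"" : ""Eduardo Fonseca Bola\u00c3\u00b1os comparti\u00c3\u00b3 una publicaci\u00c3\u00b3n."" }";
// ISO-8859-1 maps each character U+0000..U+00FF to the byte with the same value.
// Encoding.Default is the ANSI code page on .NET Framework but UTF-8 on .NET Core/.NET 5+,
// so an explicit single-byte encoding is used here instead.
var latin1 = Encoding.GetEncoding("ISO-8859-1");

// Deserialize without a target type
var parsed = Newtonsoft.Json.JsonConvert.DeserializeObject(json);
// Turn the mis-decoded characters back into bytes, then read those bytes as UTF-8
byte[] parsedBytes = latin1.GetBytes(parsed.ToString());
Console.WriteLine(Encoding.UTF8.GetString(parsedBytes));

// Deserialize using the class
var sample = Newtonsoft.Json.JsonConvert.DeserializeObject<Example>(json);
byte[] nameBytes = latin1.GetBytes(sample.Name);
Console.WriteLine(Encoding.UTF8.GetString(nameBytes));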
The output of the first Console.WriteLine is:
{
"Name": "Eduardo Fonseca Bolaños compartió una publicación."
}
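The reason the re-encoding step works is that each \u00XX escape in the input holds one byte of the original UTF-8 text rather than a real code point. The following is a minimal sketch of that idea without any JSON library; the class name MojibakeDemo and the helper FixMojibake are hypothetical names chosen for illustration, not part of Json.NET or of the answer above.

```csharp
using System;
using System.Linq;
using System.Text;

public static class MojibakeDemo
{
    // Hypothetical helper: treats every char as a single byte and
    // decodes the resulting byte sequence as UTF-8.
    public static string FixMojibake(string misdecoded)
    {
        // If any char does not fit in one byte, the text was not damaged
        // in this particular way, so it is returned unchanged.
        if (misdecoded.Any(c => c > 0xFF))
            return misdecoded;

        byte[] bytes = misdecoded.Select(c => (byte)c).ToArray();
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        // "\u00c3\u00b1" is the two-byte UTF-8 encoding of "ñ" (0xC3 0xB1),
        // stored as two separate characters.
        string damaged = "Bola\u00c3\u00b1os comparti\u00c3\u00b3";
        Console.WriteLine(FixMojibake(damaged));
    }
}
```

This should only be applied to strings known to be damaged this way: a correctly decoded string that contains, say, a lone "ñ" (U+00F1) would be turned into a replacement character, because 0xF1 on its own is not valid UTF-8.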
You can use the System.Web.Helpers.Json.Decode method, so no third-party JSON library is needed (the method lives in the System.Web.Helpers assembly that ships with ASP.NET Web Pages).
Here is a fix for this specific situation:
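// Requires: using System.Text.RegularExpressions; and a reference to System.Web
// Matches one or more consecutive \uXXXX escape sequences (four hex digits each)
private static readonly Regex _regex =
    new Regex(@"(\\u(?<Value>[0-9a-fA-F]{4}))+", RegexOptions.Compiled);

private static string ConvertUnicodeEscapeSequencetoUTF8Characters(string sourceContent)
{
    // Check https://stackoverflow.com/questions/9738282/replace-unicode-escape-sequences-in-a-string
    return _regex.Replace(
        sourceContent, m =>
        {
            // Rewrite each \u00XX as the percent-encoded byte %XX ...
            var urlEncoded = m.Groups[0].Value.Replace(@"\u00", "%");
            // ... and let UrlDecode read the run of bytes as UTF-8 (its default encoding)
            var urlDecoded = System.Web.HttpUtility.UrlDecode(urlEncoded);
            return urlDecoded;
        }
    );
}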