
JsonNode.Parse: error parsing text with accents

I am trying to parse Latin text with the Parse method of JsonNode from System.Text.Json.

But when the text contains accented characters, the result contains escape sequences instead.

var jsonString = File.ReadAllText(path, Encoding.GetEncoding(1252));                   
var jTemplate = JsonNode.Parse(jsonString);

The string jsonString contains the correct text (with accents), but after I call JsonNode.Parse, jTemplate shows the text with the accent escaped:

"Ciberseguridad en la organización" in jsonString

"Ciberseguridad en la organizaci\u00F3n" in jTemplate

I have also tried other encodings and code pages, for example UTF-8, with the same result.

Any idea how to parse text with accents?

Thanks in advance.

For the moment, JsonNode.Parse() doesn't provide a way to set the Encoder the way JsonSerializer does.

You have two options:

  1. Use JsonSerializer instead and follow the tips from the link above.

  2. Unescape the string value after parsing it with JsonNode:

     var expectedValue = Regex.Unescape(jTemplate["key"].ToString());
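
As a rough sketch of a third route, assuming the escaping only shows up when the node is written back out as JSON text (for example via ToJsonString() or a debugger view): reading the value with GetValue<string>() should return the decoded string, and ToJsonString accepts a JsonSerializerOptions whose Encoder controls escaping on output. The "data" key and the sample input are assumptions for illustration, not taken from the question's file.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Json.Nodes;
using System.Text.Unicode;

// Hypothetical input; the "data" key is an assumption for illustration.
string jsonString = "{\"data\": \"Ciberseguridad en la organización\"}";
JsonNode? jTemplate = JsonNode.Parse(jsonString);

// Reading the value returns the decoded string, accents included.
string value = jTemplate!["data"]!.GetValue<string>();
Console.WriteLine(value);

// With no options, non-ASCII characters are escaped when the node is written as JSON.
Console.WriteLine(jTemplate.ToJsonString());

// Options with a wider encoder keep the accented characters as-is on output.
var options = new JsonSerializerOptions
{
    Encoder = JavaScriptEncoder.Create(UnicodeRanges.All)
};
Console.WriteLine(jTemplate.ToJsonString(options));
```

If this holds for your case, the parsed data itself is intact and only the serialized representation differs, so no Regex.Unescape step is needed for values read through GetValue<string>().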

I suggest using the JsonSerializer.Deserialize method, which accepts a JsonSerializerOptions object where you can set the Encoder.

The output of my code sample is:

Ciberseguridad en la organización

using System;
using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Unicode;

string jsonString = "{\"data\": \"Ciberseguridad en la organización\"}";
JsonSerializerOptions options = new JsonSerializerOptions()
{
    // Allow all Unicode ranges so accented characters are not escaped when writing JSON.
    Encoder = JavaScriptEncoder.Create(UnicodeRanges.All)
};
DataDto? jTemplate = JsonSerializer.Deserialize<DataDto>(jsonString, options);
Console.WriteLine(jTemplate?.data);

class DataDto
{
    // Lowercase name matches the "data" key; property matching is case-sensitive by default.
    public string? data { get; set; }
}
