
How can I convert escaped Unicode to regular-format Unicode code points?

I have this code to help parse the Unicode escapes for an emoji:

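using System.Globalization;
using System.Text.RegularExpressions;

// Replaces each literal \uXXXX escape in the text with the UTF-16 code unit it names.
public string DecodeEncodedNonAsciiCharacters(string value)
{
    return Regex.Replace(
        value,
        @"\\u(?<Value>[0-9a-fA-F]{4})", // exactly four hex digits
        m => ((char)int.Parse(m.Groups["Value"].Value, NumberStyles.HexNumber)).ToString()
    );
}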

So I pass the result of this call

DecodeEncodedNonAsciiCharacters("\uD83C\uDFCB\uD83C\uDFFF\u200D\u2642\uFE0F");

to Console.WriteLine(), which prints this emoji: 🏋🏿‍♂️. My question is, how can I turn this

"\uD83C\uDFCB\uD83C\uDFFF\u200D\u2642\uFE0F"

into these code points:

U+1F3CB, U+1F3FF, U+200D, U+2642, U+FE0F

The code points above are from Emojipedia.org.

It seems that you want to combine each surrogate pair (two UTF-16 characters) into a single UTF-32 code point:

\uD83C\uDFCB => \U0001F3CB
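
To make the mapping above concrete, here is a minimal sketch of the arithmetic behind combining a surrogate pair. The class name SurrogateMath and the method Combine are hypothetical names used only for illustration; in real code the built-in char.ConvertToUtf32 does the same job and also validates its input.

```csharp
using System;

public static class SurrogateMath
{
    // Hypothetical helper: combines a high and a low surrogate by hand.
    // Assumes the caller passes a valid pair (high in D800..DBFF, low in DC00..DFFF);
    // no validation is done here.
    public static int Combine(char high, char low) =>
        0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
}
```

For the pair above, high - 0xD800 is 0x3C, which shifted left by 10 bits gives 0xF000, and low - 0xDC00 is 0x3CB, so the sum is 0x10000 + 0xF000 + 0x3CB = 0x1F3CB. That is the same value char.ConvertToUtf32('\uD83C', '\uDFCB') returns.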

If that's the case, you can do it like this:

Code:

// Requires: using System.Collections.Generic; using System.Linq;
public static IEnumerable<int> CombineSurrogates(string value) {
  if (null == value)
    yield break; // or throw new ArgumentNullException(nameof(value));

  for (int i = 0; i < value.Length; ++i) {
    char current = value[i];
    char next = i < value.Length - 1 ? value[i + 1] : '\0';

    if (char.IsSurrogatePair(current, next)) {
      yield return (char.ConvertToUtf32(current, next));

      i += 1;
    }
    else
      yield return (int)current;
  }
}

public static string DecodeEncodedNonAsciiCharacters(string value) =>
  string.Join(" ", CombineSurrogates(value).Select(code => $"U+{code:X4}"));

Demo:

string data = "\uD83C\uDFCB\uD83C\uDFFF\u200D\u2642\uFE0F";

// If you want codes, uncomment the line below
//int[] codes = CombineSurrogates(data).ToArray();

string result = DecodeEncodedNonAsciiCharacters(data);

Console.Write(result);

Outcome:

U+1F3CB U+1F3FF U+200D U+2642 U+FE0F
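
As an alternative, and assuming you target .NET Core 3.0 or later, string.EnumerateRunes() walks a string one Unicode scalar value at a time, so the surrogate handling does not have to be written by hand. The sketch below is not part of the answer's code; RuneCodePoints and ToCodePoints are hypothetical names, and the ", " separator is chosen to match the format in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RuneCodePoints
{
    // Hypothetical helper: formats every Unicode scalar value in the text as U+XXXX.
    public static string ToCodePoints(string value)
    {
        var parts = new List<string>();

        // Each Rune is one scalar value; a surrogate pair arrives already combined.
        foreach (Rune rune in value.EnumerateRunes())
            parts.Add($"U+{rune.Value:X4}");

        return string.Join(", ", parts);
    }
}
```

One behavioural difference to be aware of: an unpaired surrogate is reported by EnumerateRunes() as the replacement character U+FFFD, whereas CombineSurrogates above returns the lone surrogate's own code unit.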

