
What Unicode character (emoji) was it?

I have this string in my text file: ├░┬č┬Ź┬ć

All I know is that it was an emoji, or at least some surrogate-pair character, i.e. a character represented by a JavaScript string of length 2 or 4.

For some reason it ended up in this form. (It was read from a MySQL database that uses utf8_general_ci, through a node.js/mysql2 connection with charset latin1_swedish_ci.)

How can I find out which emoji it was? Is it possible?

Other examples:

├░┬č┬ĺ┬Ž ├░┬č┬ś┬ł ├░┬č┬ą┬Á

An algorithm written in JS would be the best option.

It's double mojibake, as the following Python snippet shows (sorry, I cannot give a JavaScript equivalent):

print('🍆 💦 😈 🥵'.
      encode('utf-8').decode('latin1').  # 1st mojibake stage
      encode('utf-8').decode('cp852')    # 2nd mojibake stage
    )                                    # ├░┬č┬Ź┬ć ├░┬č┬ĺ┬Ž ├░┬č┬ś┬ł ├░┬č┬ą┬Á
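
To make the two stages easier to follow, here is a minimal sketch that keeps every intermediate form for a single character. The helper name trace_double_mojibake is hypothetical, and the encodings (latin1, then cp852) are the ones assumed in the snippet above:

```python
def trace_double_mojibake(ch: str) -> dict:
    """Return each intermediate form of the two mis-decoding stages for `ch`."""
    utf8_once = ch.encode('utf-8')        # the correct UTF-8 bytes
    stage1 = utf8_once.decode('latin1')   # 1st stage: bytes misread as Latin-1
    utf8_twice = stage1.encode('utf-8')   # misread text stored again as UTF-8
    stage2 = utf8_twice.decode('cp852')   # 2nd stage: bytes misread as CP852
    return {
        'utf8_once': utf8_once,
        'stage1': stage1,
        'utf8_twice': utf8_twice,
        'stage2': stage2,
    }


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated lowercase hex."""
    return ' '.join(f'{b:02x}' for b in data)


trace = trace_double_mojibake('🍆')
print(hex_bytes(trace['utf8_once']))   # f0 9f 8d 86
print(hex_bytes(trace['utf8_twice']))  # c3 b0 c2 9f c2 8d c2 86
print(trace['stage2'])                 # ├░┬č┬Ź┬ć
```

Each of the four original bytes becomes two bytes in the second UTF-8 encoding, which is why one emoji shows up as eight characters.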

Possible repair (although prevention is better than cure):

print('├░┬č┬Ź┬ć ├░┬č┬ĺ┬Ž ├░┬č┬ś┬ł ├░┬č┬ą┬Á'.
      encode('cp852').decode('utf-8').       # fix 2nd mojibake stage
      encode('latin1').decode('utf-8')       # fix 1st mojibake stage
    )                                        # 🍆 💦 😈 🥵
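
If the repair has to run over many strings, some of which may not be damaged, it could be wrapped in a function that falls back to the input when the round trip fails. This is only a sketch: the name repair_double_mojibake is hypothetical, and it assumes the damage followed exactly the latin1-then-cp852 path shown above:

```python
def repair_double_mojibake(text: str) -> str:
    """Undo UTF-8 -> latin1 -> UTF-8 -> cp852 damage.

    Returns `text` unchanged if it does not fit that pattern.
    """
    try:
        return (text
                .encode('cp852').decode('utf-8')    # undo 2nd mojibake stage
                .encode('latin1').decode('utf-8'))  # undo 1st mojibake stage
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in cp852/latin1, or not valid UTF-8 afterwards:
        # assume the string was not damaged this way and leave it alone.
        return text


print(repair_double_mojibake('├░┬č┬Ź┬ć'))  # 🍆
print(repair_double_mojibake('我'))         # 我 (left untouched)
```

Plain ASCII text survives both round trips unchanged, so the function is safe to apply to mixed input.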

FYI, those emojis are listed below (the CodePoint column gives the Unicode code point ( U+hhhh ) and the UTF-8 bytes; the Description column gives the character name, with the UTF-16 surrogate pair in parentheses):

Char CodePoint                      Description
---- ---------                      -----------
🍆   {U+1F346, 0xF0,0x9F,0x8D,0x86} AUBERGINE               (0xd83c,0xdf46)
💦   {U+1F4A6, 0xF0,0x9F,0x92,0xA6} SPLASHING SWEAT SYMBOL  (0xd83d,0xdca6)
😈   {U+1F608, 0xF0,0x9F,0x98,0x88} SMILING FACE WITH HORNS (0xd83d,0xde08)
🥵   {U+1F975, 0xF0,0x9F,0xA5,0xB5} OVERHEATED FACE         (0xd83e,0xdd75)
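
A row like the ones in the table can be derived with the standard library. The sketch below uses a hypothetical helper named describe; the names come from Python's unicodedata module, so recently added characters need a Python version whose Unicode database includes them:

```python
import unicodedata


def describe(ch: str) -> str:
    """Format one character as: char {U+hhhh, UTF-8 bytes} NAME (UTF-16 code units)."""
    utf8 = ','.join(f'0x{b:02X}' for b in ch.encode('utf-8'))
    raw = ch.encode('utf-16-be')  # 2 bytes per UTF-16 code unit
    utf16 = ','.join(
        f'0x{int.from_bytes(raw[i:i + 2], "big"):04x}'
        for i in range(0, len(raw), 2)
    )
    name = unicodedata.name(ch, '<unnamed>')
    return f'{ch} {{U+{ord(ch):04X}, {utf8}}} {name} ({utf16})'


print(describe('🍆'))
# 🍆 {U+1F346, 0xF0,0x9F,0x8D,0x86} AUBERGINE (0xd83c,0xdf46)
```

The two UTF-16 code units are what JavaScript counts, which is why each of these emojis has a string length of 2 there.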

