
C++: get a Unicode character from its UTF-8 value

I am not good with C++, and I am trying to write a function that converts a URL-encoded string into a regular string.

But I am getting weird results. For example, %C4%93 (decimal 50323) should be the UTF-8 symbol ē, but I get ō when I print it to the console. I tried:

  • string+= static_cast<char>(character_integer_value);
  • string+= (char)character_integer_value;
  • string+= character_integer_value;

but none of these gave me the output I expected.

Can you please point out what I am doing wrong?

std::string myUrldecode(const std::string& original) {
    std::string s = original;
    std::string tmp0 = "";
    int tmp1 = 0;
    int tmp2 = 0;
    std::string decoded = "";

    for (std::string::size_type i = s.find("%");
        i != std::string::npos;
        i = s.find("%"))
    {
        if(i > 0){
            decoded+= tmp0;
            tmp0 = "";
            tmp2 = 0;
        }
        decoded+= s.substr(0, i);
        s.erase(0, i);

        tmp0+= s.substr(0, 2);
        tmp1 = strtol(s.substr(1, 2).c_str(), nullptr, 16);

        if(tmp1 >= 20 && tmp1 < 127){
            decoded+= static_cast<char>(tmp1);
            s.erase(0, 3);
            tmp0 = "";
        }
        else if(tmp1 >= 192 && tmp1 < 223){
            tmp2 = tmp1;
            s.erase(0, 3);
        }
        else if(tmp1 >= 128 && tmp1 <= 191 && tmp2 > 192){
            tmp1+= tmp2 * 256;
            decoded+= tmp1;
            s.erase(0, 3);
            tmp0 = "";
        }
        else{
            s.erase(0, 3);
        }
    }
    decoded+= tmp0;
    decoded+= s;
    return decoded;
}

I am using Dev-C++ 5.11 with 32-bit GCC 4.9.2 to compile this code.

You have got it totally wrong.

"%C4%93" is the UTF-8 encoding of ē, so you just need to convert the two numbers (0xC4 and 0x93) into chars and append them one after the other. Instead, you seem to be worrying about character ranges such as 127-192.
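
To illustrate the byte-by-byte idea, here is a minimal sketch rather than a drop-in replacement. The names urlDecodeBytes and hexValue are made up for this example. It assumes the input is plain percent-encoding: it does not translate '+' into a space, and it copies malformed escapes through unchanged.

```cpp
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical function: turns every %XX escape into exactly one byte.
// Multi-byte UTF-8 sequences need no special handling, because each
// %XX is already one byte of the encoded character.
std::string urlDecodeBytes(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            int hi = hexValue(in[i + 1]);
            int lo = hexValue(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);  // one escape -> one byte
                i += 2;                                  // skip the two hex digits
                continue;
            }
        }
        out += in[i];  // ordinary character, or a malformed escape copied as-is
    }
    return out;
}
```

With this sketch, urlDecodeBytes("%C4%93") returns a two-byte string holding 0xC4 and 0x93, which a UTF-8 console shows as ē. In the question's code, decoded += tmp1 converts the int 50323 to a single char, which in practice keeps only the low byte 0x93, so the 0xC4 byte is lost.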

I think the code you have written may be trying to convert a Unicode code point into UTF-8 (275 -> C4 93).
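
For comparison, that other direction (code point to UTF-8 bytes) could be sketched roughly as follows. The name encodeUtf8 is hypothetical, and the sketch assumes the caller passes a valid code point: it does not reject surrogates, and it returns an empty string for values above 0x10FFFF.

```cpp
#include <string>

// Hypothetical function: encodes one Unicode code point as UTF-8 bytes.
std::string encodeUtf8(unsigned long cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte:  0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {           // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For ē, the code point is 275 (0x113): 0xC0 | (0x113 >> 6) gives 0xC4 and 0x80 | (0x113 & 0x3F) gives 0x93, which are the same two bytes that appear in "%C4%93". URL decoding never needs this step, because the bytes in the URL are already UTF-8.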

