(C only) Converting special characters in a char string to hex Unicode

Question

I'm trying to use this code: https://www.includehelp.com/c/convert-ascii-string-to-hexadecimal-string-in-c.aspx

This code works perfectly in my program.

It correctly converts plain ASCII characters such as A, m, n, d, 0 and 9 from UTF-8 to their hexadecimal Unicode codes.

Could anybody tell me how to modify this program so that it also handles "special characters" inside the string, such as accented vowels and similar letters (ñ, ç, à, á, ...)? When I run the program on them, it doesn't work as I expected.

I'm working on RHEL 7 with plain C (sorry, but I don't know the compiler version).

The special characters that I'm trying to convert to hex Unicode are encoded in UTF-8.

#include <stdio.h>
#include <string.h>

//function to convert ascii char[] to hex-string (char[])
void string2hexString(char* input, char* output)
{
    int loop;
    int i; 

    i=0;
    loop=0;

    while(input[loop] != '\0')
    {
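        //cast to unsigned char so bytes >= 0x80 are not sign-extended
        //(a negative char would print as FFFFFFxx and overflow output)
        sprintf((char*)(output+i),"%02X", (unsigned char)input[loop]);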
        loop+=1;
        i+=2;
    }
    //terminate the output string with a NUL character
    output[i] = '\0';
}

int main(){
    char ascii_str[] = "Hello world!";
    //declare output string with double size of input string
    //because each character of input string will be converted
    //in 2 bytes
    int len = strlen(ascii_str);
    char hex_str[(len*2)+1];

    //converting ascii string to hex string
    string2hexString(ascii_str, hex_str);

    printf("ascii_str: %s\n", ascii_str);
    printf("hex_str: %s\n", hex_str);

    return 0;
}

Output

ascii_str: Hello world!

hex_str: 48656C6C6F20776F726C6421

I would like to set "ascii_str" to something like "áéíóúàèìòùç" and obtain, in a string, the hexadecimal Unicode code of each character, as follows:

letter-> á // hex code--> e1
letter-> é // hex code--> e9
letter-> í // hex code--> ed
letter-> ó // hex code--> f3
letter-> ú // hex code--> fa
letter-> à // hex code--> e0
letter-> è // hex code--> e8
letter-> ì // hex code--> ec
letter-> ò // hex code--> f2
letter-> ù // hex code--> f9
letter-> ç // hex code--> e7

Answer 1

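Please read about UTF-8 encoding. UTF-8 is a variable-length encoding: a single code point (character) occupies between one and four bytes. The characters you are converting occupy two bytes each, not one. For example, á is stored as the two bytes C3 A1, while its Unicode code point is U+00E1. The code you are using prints one hex pair per byte, so it is not suitable for this purpose.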
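
To make the point above concrete, here is a minimal sketch of one way to get code points instead of bytes: decode each UTF-8 sequence by hand, then print the resulting code point in hex. The function names utf8_decode and utf8_to_hex_codepoints are hypothetical (they are not part of the linked code or of any library), and the sketch assumes the input really is UTF-8 and that the caller supplies a large enough output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: decode one UTF-8 sequence starting at s.
 * Stores the code point in *cp and returns the number of bytes consumed,
 * or 0 if the bytes do not form a valid sequence. */
static size_t utf8_decode(const unsigned char *s, unsigned long *cp)
{
    size_t len, k;
    unsigned long min;   /* smallest code point allowed for this length */

    if (s[0] < 0x80)                { *cp = s[0];        return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; len = 2; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; len = 3; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; len = 4; min = 0x10000; }
    else return 0;       /* stray continuation byte or invalid lead byte */

    for (k = 1; k < len; k++) {
        /* every following byte must look like 10xxxxxx;
         * this also stops at a premature '\0' */
        if ((s[k] & 0xC0) != 0x80) return 0;
        *cp = (*cp << 6) | (s[k] & 0x3F);
    }
    /* reject overlong forms, surrogates and values beyond U+10FFFF */
    if (*cp < min || *cp > 0x10FFFF || (*cp >= 0xD800 && *cp <= 0xDFFF))
        return 0;
    return len;
}

/* Hypothetical helper: write the code points of the UTF-8 string `in`
 * as space-separated uppercase hex into `out` (capacity out_size).
 * Returns 0 on success, -1 on invalid input or if `out` is too small. */
int utf8_to_hex_codepoints(const char *in, char *out, size_t out_size)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t used = 0;

    assert(in != NULL && out != NULL && out_size > 0);
    out[0] = '\0';

    while (*p != '\0') {
        unsigned long cp;
        size_t n = utf8_decode(p, &cp);
        int w;

        if (n == 0) return -1;
        w = snprintf(out + used, out_size - used, "%s%02lX",
                     used ? " " : "", cp);
        if (w < 0 || (size_t)w >= out_size - used) return -1;
        used += (size_t)w;
        p += n;
    }
    return 0;
}
```

Called with a UTF-8 encoded "áéí" and a 64-byte buffer, utf8_to_hex_codepoints should leave "E1 E9 ED" in the buffer. The codes are separated by spaces because code points above FF need more than two hex digits, so simply concatenating them, as the original program does for bytes, would be ambiguous. Use "%02lx" instead if lowercase output is preferred.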
