
How to change a string's encoding to UTF-8 in C

How can I change the character encoding of a string to UTF-8? I am making some execv calls to a Python program, but Python returns the strings with some characters cut off. I don't know if this is a Python issue or a C issue, but I thought that if I could change the string's encoding in C and then pass it to Python, it should do the trick. So how can I do that?

Thanks.

There is no such thing as character encoding in C.

A char* can hold any data; how you interpret the characters is up to you. For instance, printf will typically dump the characters as they are to the standard output, and if your console interprets those bytes as UTF-8, they'll appear as such.
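To see what a char* really holds, it can help to look at the raw bytes rather than at what the console renders. The following is a minimal sketch; hex_dump is a hypothetical helper name, not a standard function, and it assumes the caller supplies a buffer of at least three bytes per input byte plus one.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the bytes of s into out as hex pairs,
   each followed by a space. Returns the number of bytes dumped,
   or -1 if out is too small to hold the result. */
int hex_dump(const char *s, char *out, size_t out_size)
{
    size_t n = strlen(s);
    if (out_size < n * 3 + 1)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        /* Go through unsigned char so bytes >= 0x80 are not sign-extended. */
        sprintf(out + i * 3, "%02X ", (unsigned)(unsigned char)s[i]);
    }
    return (int)n;
}
```

If the string "é" dumps as C3 A9, it is stored as UTF-8; if it dumps as a single E9, it is most likely Latin-1 or a similar single-byte encoding.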

If you want to convert between different encodings on the C side, you can have a look at ICU.
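ICU handles arbitrary encodings. For the narrow case where the input is known to be ISO-8859-1 (Latin-1), which is purely an assumption here since the question does not say what the source encoding is, the conversion is simple enough to sketch by hand. This is not ICU's API; latin1_to_utf8 is a hypothetical name used for illustration.

```c
#include <stddef.h>

/* Hypothetical helper, assuming the input is ISO-8859-1 (Latin-1).
   Writes a NUL-terminated UTF-8 copy of in into out.
   Returns the number of bytes written (excluding the NUL),
   or -1 if out is too small. */
int latin1_to_utf8(const char *in, char *out, size_t out_size)
{
    size_t o = 0;
    for (const unsigned char *p = (const unsigned char *)in; *p; p++) {
        if (*p < 0x80) {
            /* ASCII maps to itself; keep one byte free for the NUL. */
            if (o + 1 >= out_size)
                return -1;
            out[o++] = (char)*p;
        } else {
            /* U+0080..U+00FF become two bytes: 110xxxxx 10xxxxxx. */
            if (o + 2 >= out_size)
                return -1;
            out[o++] = (char)(0xC0 | (*p >> 6));
            out[o++] = (char)(0x80 | (*p & 0x3F));
        }
    }
    if (o >= out_size)
        return -1;
    out[o] = '\0';
    return (int)o;
}
```

An output buffer of twice the input length plus one is always large enough, because each Latin-1 byte expands to at most two UTF-8 bytes.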

If you want to convert between encodings on the Python side, look at http://docs.python.org/howto/unicode.html.

C as a language has no built-in notion of string encoding. A C string is simply a null-terminated sequence of char values (8-bit integers on most systems; whether char is signed is implementation-defined).

A wide string (with characters of type wchar_t, typically 16 bits wide on Windows and 32 bits wide on Linux and most other Unix-like systems) can also be used to hold larger character values; however, again, the basic C string functions and data types treat a string as a plain sequence of values and attach no encoding to it.

The answer to your question is to ensure that the strings you're passing into Python are encoded as UTF-8.
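One way to check this before calling exec is to validate each argument as UTF-8. The following is a minimal sketch of such a check; is_valid_utf8 is a hypothetical name, and the function only reports whether the bytes form well-formed UTF-8, not what encoding they are in otherwise.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if the NUL-terminated string s is
   well-formed UTF-8 (no stray or missing continuation bytes, no overlong
   forms, no surrogates, nothing above U+10FFFF). */
bool is_valid_utf8(const char *s)
{
    /* Smallest code point that legitimately needs 1, 2, 3, 4 bytes. */
    static const unsigned int min_cp[] = {0, 0x80, 0x800, 0x10000};
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned int cp;
        int extra;  /* number of continuation bytes expected */

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return false;            /* stray continuation or invalid lead */
        p++;

        for (int i = 0; i < extra; i++, p++) {
            /* Also fails on the terminating NUL, i.e. a truncated sequence. */
            if ((*p & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (*p & 0x3F);
        }

        if (cp < min_cp[extra])           return false;  /* overlong form */
        if (cp > 0x10FFFF)                return false;  /* out of range */
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  /* surrogate */
    }
    return true;
}
```

If an argument fails this check, it is probably in some other encoding and needs converting before it is handed to Python; if every argument passes, the truncation is more likely happening on the Python side.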

In order to help you accomplish that in any detailed capacity, however, you will have to provide more information about how your strings are currently formed, what they contain, and how you're constructing your argument list for exec.
