
Get a char out of a uint64_t in C

As the title says, I'm not sure how to get a char (a value of type char, not just a byte) out of, for example, a uint64_t.

I guess a type cast won't work?

Thanks a lot!

The thing is, a char in C is only one byte, so it can mainly represent ASCII characters. If your character is a Unicode character outside that range, it simply cannot be converted to a single char. If you do want to be able to store Unicode characters, you should use some other type, such as wchar_t (wide character, implementation-defined size), char16_t (a UTF-16 code unit, which still cannot represent some characters on its own, such as emoji and other characters that need 4 bytes), or even char32_t.

Either way, a simple cast should work, as long as the target type is wide enough for the kind of character you are storing (ASCII or Unicode).

Note: Either way, the compiler may warn you about possible data loss in the process when the conversion is implicit, because uint64_t can store far more values than there are characters and is therefore wider than any character type.
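To make the size argument above concrete, here is a minimal sketch that reports how wide a single code unit must be to hold a given value. The name bits_needed_for_code_point is hypothetical, and the sketch assumes the uint64_t holds exactly one Unicode code point; the returned widths correspond to char, char16_t and char32_t respectively.

```c
#include <stdint.h>

/* Hypothetical helper: smallest code unit width (8, 16 or 32 bits) that can
   hold the code point in a single unit, or 0 if it is not a valid code point. */
int bits_needed_for_code_point(uint64_t cp) {
    if (cp > 0x10FFFF) return 0;                 /* beyond the Unicode range */
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  /* surrogates are not characters */
    if (cp <= 0x7F) return 8;                    /* ASCII fits in a plain char */
    if (cp <= 0xFFFF) return 16;                 /* fits in one UTF-16 code unit */
    return 32;                                   /* needs a 32-bit unit */
}
```

For instance, 0x41 ('A') needs 8 bits, U+03A3 needs 16, and an emoji such as U+1F600 needs 32.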

So if you want to get a char value and your uint64_t fits in one byte (at most 255), you can try casting it to uint8_t first, like this:

printf("%c", (uint8_t)var);
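A narrowing conversion to an 8-bit type keeps only the low 8 bits, so it may be worth checking the range before converting. A minimal sketch, where to_single_char is a hypothetical helper name and anything above 0xFF is treated as an error:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: stores var in *out and returns true if it fits in
   one byte; returns false and leaves *out untouched otherwise. */
bool to_single_char(uint64_t var, unsigned char *out) {
    if (var > 0xFF) return false;   /* the upper bits would be lost */
    *out = (unsigned char)var;
    return true;
}
```

Without such a check, a value like 0x141 would silently become 0x41 after the cast.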

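Otherwise, if you need every char in the uint64_t, you can try:

/* Unpacks num into 8 bytes, most significant byte first.
   mesg must have room for 8 chars and is not null-terminated. */
void int64ToChar(char mesg[], uint64_t num) {
    for (int i = 0; i < 8; i++)
        mesg[i] = (char)(num >> ((8 - 1 - i) * 8));
}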
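If the result should be usable as a C string, a variant along the following lines may help. This is only a sketch: uint64_to_string is a hypothetical name, and it assumes that zero bytes in the value are padding that can be skipped.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the non-zero bytes of num, most significant
   first, into buf as a null-terminated string. buf must have room for
   9 chars. Returns the number of characters written (without the '\0'). */
size_t uint64_to_string(uint64_t num, char buf[9]) {
    size_t len = 0;
    for (int shift = 56; shift >= 0; shift -= 8) {
        unsigned char byte = (unsigned char)(num >> shift);
        if (byte != 0) buf[len++] = (char)byte;   /* skip zero bytes */
    }
    buf[len] = '\0';
    return len;
}
```

Calling uint64_to_string(0x48656c6c6f, buf) leaves the string "Hello" in buf and returns 5.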

It depends on how the character, or characters, got into the uint64_t value in the first place.

If you say

uint64_t uu = 0x41;

then uu contains the ASCII value of a single character, and it's trivial to pull it back out. You don't even need a cast:

char c = uu;
printf("%c\n", c);      /* prints "A" */

Of course, since it's 64 bits wide, a uint64_t can theoretically have up to eight 8-bit ASCII characters jammed into it:

uu = 0x48656c6c6f;      /* "Hello" in hex */

If so, you can extract individual characters using some bit manipulation:

c = (uu >> 24) & 0xff;
printf("%c\n", c);      /* prints "e" */
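The shift-and-mask idea generalizes to any byte position, and the packing can be done the same way in reverse. A minimal sketch, with pack_chars and byte_at as hypothetical helper names; byte positions are counted from 0 at the least significant byte:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: packs up to 8 characters of s into a uint64_t,
   so the last character packed ends up in the least significant byte. */
uint64_t pack_chars(const char *s) {
    uint64_t packed = 0;
    for (size_t i = 0; i < 8 && s[i] != '\0'; i++)
        packed = (packed << 8) | (unsigned char)s[i];
    return packed;
}

/* Hypothetical helper: returns the byte at position index, counting from
   0 at the least significant byte. */
char byte_at(uint64_t value, unsigned index) {
    assert(index < 8);                       /* a uint64_t has 8 bytes */
    return (char)((value >> (index * 8)) & 0xff);
}
```

With these, pack_chars("Hello") yields 0x48656c6c6f, and byte_at(0x48656c6c6f, 3) is the same 'e' extracted by the shift of 24 above.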

Finally, since uint64_t is wider than 8 bits, it can also contain Unicode characters. For example, I could write:

uu = 0x03A3;            /* U+03A3 Greek Capital Letter Sigma */

But now there's no way to extract that as a plain char, or print it using %c. I'd have to use a wchar_t, and %lc:

wchar_t wc = uu;
printf("%lc\n", wc);    /* might print "Σ" */

Note that besides requiring wchar_t and %lc, this last example works only if the output device is Unicode-capable and the locale is set up properly.
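Setting up the locale usually means calling setlocale before the first wide-character output. The following is a sketch under assumptions: print_code_point is a hypothetical name, the environment is assumed to specify a UTF-8 locale, and wchar_t is assumed to be wide enough to hold the code point (it is only 16 bits on some platforms).

```c
#include <locale.h>
#include <stdint.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: prints one code point followed by a newline.
   Returns what printf returns: the number of bytes written, or a negative
   value if the character cannot be converted in the current locale. */
int print_code_point(uint64_t code_point) {
    /* Adopt the locale from the environment (e.g. LANG=en_US.UTF-8).
       If this fails, the program stays in the default "C" locale.
       A real program would normally do this once at startup. */
    setlocale(LC_ALL, "");
    wchar_t wc = (wchar_t)code_point;
    return printf("%lc\n", (wint_t)wc);      /* %lc expects a wint_t argument */
}
```

Whether print_code_point(0x03A3) actually shows Σ still depends on the terminal and the active locale; a plain ASCII value such as 0x41 should print in any locale.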
