Mysterious ASCII value when working with chars and integers in C

The idea that in C a char looks up a value in ASCII (but doesn't become an integer) makes sense.

I wrote some code to illustrate this point, in which an integer value above 256 (256 possible values and the total number of ASCII items) wraps around to the beginning of ASCII, or 0. What I found interesting was that arithmetic works when I start with an integer d and add an integer to the character c.

// C starts as a character
char c = 'c';
printf("c equals %i\n", c);
printf("c in ascii: %c\n", c);
printf("\n");

// I starts as an integer
int i = 105;
printf("i equals %i\n", i);
printf("i in ascii: %c\n", i);
printf("\n");

// Using arithmetic on character 'c'
int d = c + 1;
printf("d equals %i\n", d);
printf("d in ascii: %c\n", d);
printf("\n");

// The value of a in ascii (97) + the number of ascii characters (256)
int a = 353;
printf("a equals %i\n", a);
printf("a in ascii: %c\n", a);

Output:
c equals 99
c in ascii: c

i equals 105
i in ascii: i

d equals 100
d in ascii: d

a equals 353
a in ascii: a

However, I encountered a mystery when starting with a char d and adding an integer to another char c.

// This makes sense...
char c = 'c';
int z = c + 100;
// But I would expect d to equal 199 as for z
char d = c + 100;

printf("c equals %i\n", c);
printf("z equals: %i\n", z);
printf("d equals %i\n", d);
printf("d equals %c\n", d);

Output:
c equals 99
z equals: 199
d equals -57
d equals 

d mysteriously becomes -57 and prints as a blank space when printed as a char. A debugger shows me that d has an ASCII value of '\307', which I can't explain.

The idea that in C a char looks up a value in ASCII (but doesn't become an integer) makes sense.

In C, a character constant is an integer and has type int.

C implementations do not necessarily use ASCII. The compiler generally does not have to look it up, because it receives the source code already encoded as bytes in a file or stream. It may have to do some translation between different character encodings, such as between ASCII and UTF-8.

I wrote some code to illustrate this point, in which an integer value above 256 (256 possible values and the total number of ASCII items) wraps around to the beginning of ASCII, or 0.

You should not rely on this behavior without understanding it. It may not always happen that way.

int a = 353;
printf("a equals %i\n", a);
printf("a in ascii: %c\n", a);

When the %c conversion is used, the value passed for it is converted to an unsigned char, per C 2018 7.21.6.1 8. In common C implementations, unsigned char is eight bits. Per C 2018 6.3.1.3 2, this conversion works modulo 256; it wraps as you described. Thus the character with code 353−256 = 97 is printed. This has nothing to do with ASCII; it is a result of unsigned char using eight bits. If the C implementation uses ASCII, then the value of 97 will cause an “a” to be printed.

char c = 'c';
int z = c + 100;
char d = c + 100;

printf("d equals %i\n", d);
printf("d equals %c\n", d);

In char d = c + 100;, the arithmetic is performed using int types. This is because 100 is an int constant, and the operands of + are converted to have a common type. (There are some complicated rules for this.) Given that the character 'c' has the value 99, the variable c is 99, and c + 100 yields 199.

Then the char d is initialized with 199. The C standard permits char to be signed or unsigned. It appears in your implementation that char is signed and eight bits, with values ranging from −128 to +127. So 199 cannot be represented in a char. Then the rules in C 2018 6.3.1.3 3 say that 199 is converted to an implementation-defined value or produces a signal.

It appears your implementation wraps this value modulo 256. So the result is 199−256 = −57, which is representable in a char, so d is initialized to −57.

Then, when you print this with %i, “−57” is printed.

When you print it with “%c”, it is converted to an unsigned char, as described above. This yields −57+256 = 199. This is not a code for an ASCII character, so your C implementation prints whatever character it has for value 199. That could appear as a blank space.

A debugger shows me that d has an ASCII value of '\307',…

\nnn is a common way of writing characters using octal. \307 means $307_8 = 3 \cdot 8^2 + 0 \cdot 8^1 + 7 \cdot 8^0 = 3 \cdot 64 + 0 \cdot 8 + 7 \cdot 1 = 192 + 0 + 7 = 199$.

