
What happens if I use "&" with a string in the scanf function?

I just saw some code in a blog. It used

scanf("%s",&T);

But as we know, we shouldn't use an ampersand with a string, because the array name already gives the address of its first element. I ran that code and, surprisingly, it works, so I want to know: what happens when I use & with a string?

#include <stdio.h>
int main()
{
    char T[2];
    scanf("%s", &T);
    printf("You entered %s\n", T);
}

Technically speaking, this is a type mismatch, leading to undefined behavior. For scanning a string, the expected argument is a pointer to the initial element of a character array.

When you have an array t of type char[somevalue] and you write

scanf("%s",t);

t decays to a pointer to the first element, so that is OK.

On the other hand, &t is of type char (*)[somevalue]: a pointer to an array, that is, to the whole array, not a pointer to the initial element of the array.

Now, since the array and its first element start at the same memory location, writing the scanned value to the supplied address may not cause any problem and may work as intended, but this is neither defined nor recommended.

The relevant part of the code snippet is:

char T[2];
scanf("%s", &T);

&T is a pointer to the array of two characters ( char (*)[2] ). This is not the type that scanf needs for a %s specifier: it needs a pointer to a character ( char * ). So the behavior of the program is undefined.

The correct way to write this program, as you know, is

char T[2];
scanf("%s", T);

Since T is an array, when it is used in most contexts, it “decays” to a pointer to the first character: T is equivalent to &(T[0]), which has the type char * . This decay does not happen when you take the address of the array ( &T ) or its size ( sizeof(T) ).

In practice, almost all platforms use the same representation for all pointers to the same address, so the compiler generates exactly the same code for T and &T . There are some rare platforms that may generate different code (I've heard of them but I couldn't name one). Some platforms use different encodings for “byte pointers” and “word pointers”, because their processor natively addresses words, not bytes. On such platforms, an int * and a char * that point to the same address have different encodings. A cast between those types converts the value, but misuse in something like a variable argument list would result in the wrong address. I would expect such platforms to use byte addresses for a char array, however. There are also rare platforms where a pointer encodes not only the address of the data, but also some type or size information. However, on such platforms, the type and size information would have to be equivalent: it's a block of 2 bytes, starting at the address of T , and addressable byte by byte. So this particular mistake is unlikely to have any practical impact.

Note that it would be completely different if you had a pointer instead of an array in the first place:

char *T; // known to point to an array of two characters
scanf("%s", &T); // bad

Here &T is a pointer to the location in memory that contains the address of the character array. So scanf would write the characters that it reads at the location where the pointer T is stored in memory, not at the location that T points to. Most compilers analyze the format string of functions like printf and scanf and so would emit a warning.

Note that char T[2] only has room for two characters, and this includes the null byte at the end of the string. So scanf("%s", T) only has room to read a single character. If the input contains more than one non-whitespace character at this point, the program will overflow the buffer. To read a single character and make it a one-character string, use

char T[2];
scanf("%c", T);
T[1] = 0;

Unlike scanf("%s", T) , this reads any character, even whitespace. To read a string with a length limit, add a limit to the %s specification. You should never use an unlimited %s in scanf , since this will read as much input as is available, regardless of how much room there is to store this input in memory.

char T[2];
scanf("%1s", T); // one less than the array size


 