
How to read the standard input into string variable until EOF in C?

I am getting "Bus Error" trying to read stdin into a char* variable. I just want to read everything coming over stdin, put it into a variable first, and then continue working on the variable.

My code is as follows:

char* content;
char* c;
while( scanf( "%c", c)) {
 strcat( content, c);
}

fprintf( stdout, "Size: %d", strlen( content));

But somehow I always get "Bus error" when calling cat test.txt | myapp, where myapp is the compiled code above.

My question is: how do I read stdin until EOF into a variable? As you can see in the code, I just want to print the size of the input coming over stdin; in this case it should equal the size of the file test.txt.

I thought just using scanf would be enough. Is there maybe a buffered way to read stdin?

First, you're passing uninitialized pointers, which means scanf and strcat will write to memory you don't own. Second, strcat expects two null-terminated strings, while c is just a character. This will again cause it to read memory you don't own. You don't need scanf, because you're not doing any real processing. Finally, reading one character at a time is needlessly slow. Here's the beginning of a solution, using a resizable buffer for the final string and a fixed buffer for the fgets call:

#define BUF_SIZE 1024
char buffer[BUF_SIZE];
size_t contentSize = 1; // includes the terminating NUL
/* Preallocate space.  We could just allocate one char here, 
but that wouldn't be efficient. */
char *content = malloc(sizeof(char) * BUF_SIZE);
if(content == NULL)
{
    perror("Failed to allocate content");
    exit(1);
}
content[0] = '\0'; // make null-terminated
while(fgets(buffer, BUF_SIZE, stdin))
{
    char *old = content;
    contentSize += strlen(buffer);
    content = realloc(content, contentSize);
    if(content == NULL)
    {
        perror("Failed to reallocate content");
        free(old);
        exit(2);
    }
    strcat(content, buffer);
}

if(ferror(stdin))
{
    free(content);
    perror("Error reading from stdin.");
    exit(3);
}

EDIT: As Wolfer alluded to, a NUL byte in your input will cause the string to be terminated prematurely when using fgets. getline is a better choice if available, since it handles memory allocation and does not have issues with NUL input.
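
To make the getline suggestion concrete, here is a minimal sketch, assuming a POSIX.1-2008 system where getline is available. The helper name slurp_lines and its signature are made up for illustration; it takes a FILE* so it can be tried on any stream, and passing stdin gives the behaviour asked about in the question.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX.1-2008 declarations such as getline */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical helper (not from the original answer): reads all of `in`
   into a malloc'd buffer and stores the byte count in *out_len.
   Returns NULL on allocation or read failure. The caller frees the result. */
char *slurp_lines(FILE *in, size_t *out_len)
{
    assert(in != NULL && out_len != NULL);

    char *line = NULL;   /* getline allocates and grows this buffer itself */
    size_t cap = 0;      /* capacity of `line`, maintained by getline */
    ssize_t n;           /* bytes in the current line, or -1 at EOF/error */

    size_t len = 0;      /* bytes accumulated so far, excluding the terminator */
    char *content = malloc(1);
    if (content == NULL)
        return NULL;
    content[0] = '\0';

    while ((n = getline(&line, &cap, in)) != -1) {
        char *grown = realloc(content, len + (size_t)n + 1);
        if (grown == NULL) {
            free(content);
            free(line);
            return NULL;
        }
        content = grown;
        /* memcpy with the length getline reported, so embedded NUL bytes survive */
        memcpy(content + len, line, (size_t)n);
        len += (size_t)n;
        content[len] = '\0';
    }
    free(line);

    if (ferror(in)) {
        free(content);
        return NULL;
    }
    *out_len = len;
    return content;
}
```

Because the length comes from getline's return value rather than strlen, the reported size should match the file size even when the input contains NUL bytes.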

Your problem is that you've never allocated c and content, so they're not pointing anywhere defined; they're likely pointing to some unallocated memory, or to something that doesn't exist at all. And then you're putting data into them. You need to allocate them first. (That's what a bus error typically means: you've tried to do a memory access that's not valid.)

(Alternatively, since c always holds just a single character, you can declare it as char c and pass &c to scanf. No need to declare a string of characters when one will do.)
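
A rough sketch of that char c / &c variant, assuming all you need is the character count. The helper count_chars is a made-up name, and it uses fscanf on a FILE* so it works on any stream; fscanf(stdin, ...) behaves like scanf(...).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts the characters that can be read from `in`. */
size_t count_chars(FILE *in)
{
    assert(in != NULL);

    char c;        /* a plain char: nothing to allocate */
    size_t n = 0;

    /* fscanf returns the number of items converted: 1 on success and
       EOF (a negative value) at end of input. Comparing against 1 matters,
       because EOF is nonzero and a bare while (scanf(...)) would keep looping. */
    while (fscanf(in, "%c", &c) == 1)
        n++;

    return n;
}
```

Note that "%c" does not skip whitespace, so spaces and newlines are counted too.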

Once you do that, you'll run into the issue of making sure that content is long enough to hold all the input. Either you need to guess how much input you expect and allocate at least that much (and then error out if you exceed it), or you need a strategy to reallocate it at a larger size if it's not long enough.
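
One possible reallocation strategy is to start from a guess and double the buffer whenever it fills up. The sketch below is an illustration under that assumption, not the only way to do it; read_all and the initial size of 1024 are arbitrary choices, and it reads with fread instead of scanf.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything from `in` into a malloc'd,
   NUL-terminated buffer, doubling the capacity whenever it fills up.
   Stores the byte count in *out_len; returns NULL on failure. */
char *read_all(FILE *in, size_t *out_len)
{
    assert(in != NULL && out_len != NULL);

    size_t cap = 1024;   /* initial guess, an arbitrary value */
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        if (len + 1 == cap) {            /* only the terminator slot is left */
            if (cap > SIZE_MAX / 2) {    /* doubling would overflow size_t */
                free(buf);
                return NULL;
            }
            char *grown = realloc(buf, cap * 2);
            if (grown == NULL) {
                free(buf);
                return NULL;
            }
            buf = grown;
            cap *= 2;
        }
        size_t want = cap - len - 1;     /* keep one byte for the terminator */
        size_t got = fread(buf + len, 1, want, in);
        len += got;
        if (got < want)                  /* short read: EOF or error */
            break;
    }

    if (ferror(in)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    *out_len = len;
    return buf;
}
```

Doubling keeps the number of realloc calls logarithmic in the input size; if the leftover capacity matters, the buffer can be trimmed with one final realloc to len + 1.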

Oh, and you'll also run into the problem that strcat expects a string, not a single character. Even if you leave c as a char*, the scanf call doesn't make it a string. A single-character string is (in memory) a character followed by a null character to indicate the end of the string. scanf, when scanning for a single character, isn't going to put in the null character after it. As a result, strcat isn't going to know where the end of the string is, and will go wandering off through memory looking for the null character.

Since you don't care about the actual content, why bother building a string? I'd also use getchar():
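
If you do want to keep using strcat, one way around this is to put the character into a two-element array with an explicit terminator first. A small sketch, where append_char is a hypothetical helper and cap is the total size of the destination buffer:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: appends the character c to the string in dst,
   which has room for cap bytes in total (terminator included).
   Returns 1 on success, 0 if there is no room left. */
int append_char(char *dst, size_t cap, char c)
{
    assert(dst != NULL);

    /* one character followed by a terminator is a valid string for strcat */
    char tmp[2] = { c, '\0' };

    if (strlen(dst) + 2 > cap)   /* need space for c and the terminator */
        return 0;

    strcat(dst, tmp);
    return 1;
}
```

This still rescans dst on every call, so for large inputs tracking the length yourself is likely the better choice.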

int    c;
size_t s = 0;

while ((c = getchar()) != EOF)
{
  s++;
}

printf("Size: %zu\n", s);

This code will correctly handle cases where your file has '\0' characters in it.

The problem here is that you are dereferencing pointer variables that have no memory allocated to them (for example via malloc), so the results are undefined. On top of that, by using strcat on an uninitialized pointer that could be pointing to anything, you ended up with a bus error!

This would be the fixed code required:

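char* content = malloc (100 * sizeof(char));
int c;            /* int rather than char, so EOF can be told apart from valid characters */
size_t len = 0;   /* number of characters stored so far */
if (content != NULL){
   content[0] = '\0'; // Thanks David!
   while ((c = getchar()) != EOF)
   {
       if (len < 99){               /* leave room for the terminating '\0' */
           content[len++] = (char)c;
           content[len] = '\0';     /* keep the string terminated */
       }
   }
}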
/* When done with the variable */
free(content);

The code highlights the programmer's responsibility to manage memory: for every malloc there should be a free; if not, you have a memory leak!

Edit: Thanks to David Gelhar for pointing out my glitch! I have fixed up the code above to reflect the fixes. Of course, in a real-life situation the fixed value of 100 could be changed to a #define, to make it easy to expand the buffer by doubling the amount of memory via realloc and then trimming it to size.

Assuming that you want to read strings (of at most MAXL-1 chars each) rather than process your file char by char, I did as follows:

#include <stdio.h>
#include <string.h>
#define MAXL 256

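int main(void){
  char s[MAXL];
  s[0]=0;
  scanf("%255s",s);   /* width limit (MAXL-1) keeps scanf from overflowing s */
  while(strlen(s)>0){
    printf("Size of %s : %zu\n",s,strlen(s));   /* strlen returns size_t, hence %zu */
    s[0]=0;           /* stays empty if the next scanf hits EOF, which ends the loop */
    scanf("%255s",s);
  }
  return 0;
}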
