
Read from file or stdin

I am writing a utility which either accepts a filename or reads from stdin.

I would like to know the most robust / fastest way of checking whether stdin exists (data is being piped to the program) and, if so, reading that data in. If it doesn't exist, the processing will take place on the filename given. I have tried the following test for the size of stdin, but I believe that since it's a stream and not an actual file, it's not working as I expected, and it always prints -1. I know I could always read the input one character at a time while != EOF, but I would like a more generic solution, so that I end up with either a fd or a FILE* if stdin exists and the rest of the program functions seamlessly. I would also like to be able to know its size, provided the stream has been closed by the previous program.

#include <stdio.h>
#include <stdlib.h>

long getSizeOfInput(FILE *input){
  long retvalue = 0;
  fseek(input, 0L, SEEK_END);
  retvalue = ftell(input);
  fseek(input, 0L, SEEK_SET);
  return retvalue;
}

int main(int argc, char **argv) {
  printf("Size of stdin: %ld\n", getSizeOfInput(stdin));
  exit(0);
}

Terminal:

$ echo "hi!" | myprog
Size of stdin: -1

You're thinking about it the wrong way.

What you are trying to do:

If stdin exists, use it; else check whether the user supplied a filename.

What you should be doing instead:

If the user supplies a filename, then use the filename. Else use stdin.

You cannot know the total length of an incoming stream unless you read it all and keep it buffered. You simply cannot seek backwards in a pipe. This is a limitation of how pipes work. Pipes are not suitable for all tasks, and sometimes intermediate files are required.
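To illustrate "read it all and keep it buffered", here is a minimal sketch. The helper name `slurp` and the starting capacity are assumptions made for this example, not part of the question's code. It grows a heap buffer until EOF, so the total length is only known once the whole stream has been consumed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads `in` until EOF into a heap buffer.
 * On success returns the buffer (the caller frees it) and stores the
 * byte count in *len. Returns NULL on allocation or read failure. */
char *slurp(FILE *in, size_t *len) {
    size_t cap = 4096, used = 0;
    char *buf = malloc(cap);
    if (buf == NULL) return NULL;

    for (;;) {
        size_t n = fread(buf + used, 1, cap - used, in);
        used += n;
        if (used < cap) break;          /* short read: EOF or error */

        /* buffer is full: double its capacity and keep reading */
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) { free(buf); return NULL; }
        buf = tmp;
        cap *= 2;
    }
    if (ferror(in)) { free(buf); return NULL; }

    *len = used;
    return buf;
}
```

Calling `slurp(stdin, &len)` would work the same way for a pipe or a redirected file, at the cost of holding the entire input in memory.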

First, ask the program to tell you what is wrong by checking errno, which is set on failure, such as during fseek or ftell.

Others (tonio & LatinSuD) have explained the mistake in handling stdin versus checking for a filename. Namely, first check argc (the argument count) to see whether any command line parameters were specified, if (argc > 1), treating - as a special case meaning stdin.

If no parameters are specified, then assume the input is going to come from stdin, which is a stream, not a file, and the fseek function fails on it.

In the case of a stream, where you cannot use file-on-disk oriented library functions (i.e. fseek and ftell), you simply have to count the number of bytes read (including trailing newline characters) until receiving EOF (end-of-file).

For large files you could speed this up by using fgets to read into a char array, which reads the bytes of a (text) file more efficiently. For a binary file you need to open it with fopen(filename, "rb") and use fread instead of fgetc/fgets.

You could also check feof(stdin) / ferror(stdin) when using the byte-counting method, to detect any errors while reading from the stream.
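A minimal sketch of the chunked variant follows. The helper name `count_bytes` and the 64 KiB buffer size are assumptions chosen for illustration. It reads with fread and uses ferror afterwards to tell a genuine end-of-file from a read error.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in `in` by reading
 * fixed-size chunks. Returns the count, or -1 if a read error occurred. */
long count_bytes(FILE *in) {
    char buf[65536];
    long total = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        total += (long)n;
    }
    /* fread returns 0 both at end-of-file and on error;
     * ferror distinguishes the two cases */
    return ferror(in) ? -1L : total;
}
```

Because fread does not interpret the data, the same loop works for text and binary input.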

The sample below should be C99 compliant and portable.

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>

long getSizeOfInput(FILE *input){
   long retvalue = 0;
   int c;

   if (input != stdin) {
      if (-1 == fseek(input, 0L, SEEK_END)) {
         fprintf(stderr, "Error seek end: %s\n", strerror(errno));
         exit(EXIT_FAILURE);
      }
      if (-1 == (retvalue = ftell(input))) {
         fprintf(stderr, "ftell failed: %s\n", strerror(errno));
         exit(EXIT_FAILURE);
      }
      if (-1 == fseek(input, 0L, SEEK_SET)) {
         fprintf(stderr, "Error seek start: %s\n", strerror(errno));
         exit(EXIT_FAILURE);
      }
   } else {
      /* for stdin, we need to read in the entire stream until EOF */
      while (EOF != (c = fgetc(input))) {
         retvalue++;
      }
   }

   return retvalue;
}

int main(int argc, char **argv) {
   FILE *input;

   if (argc > 1) {
      if(!strcmp(argv[1],"-")) {
         input = stdin;
      } else {
         input = fopen(argv[1],"r");
         if (NULL == input) {
            fprintf(stderr, "Unable to open '%s': %s\n",
                  argv[1], strerror(errno));
            exit(EXIT_FAILURE);
         }
      }
   } else {
      input = stdin;
   }

   printf("Size of file: %ld\n", getSizeOfInput(input));

   return EXIT_SUCCESS;
}

You may want to look at how this is done in the cat utility, for example.

See code here. If there is no filename argument, or it is "-", then stdin is used for input. stdin will be there even if no data is pushed to it (but then your read call may wait forever).

You can just read from stdin unless the user supplies a filename?

If not, treat the special "filename" - as meaning "read from stdin". The user would have to start the program like cat file | myprogram - if they want to pipe data to it, and myprogram file if they want it to read from a file.

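#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
  FILE *input;
  if (argc != 2) {
    /* stands in for the usage() call in the original snippet */
    fprintf(stderr, "usage: %s <file|->\n", argv[0]);
    return 1;
  }
  if (!strcmp(argv[1], "-")) {
    input = stdin;
  } else {
    input = fopen(argv[1], "rb");
    if (input == NULL) {
      perror(argv[1]);
      return 1;
    }
  }
  /* ... read from input ... */
  return 0;
}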

If you're on *nix, you can check whether stdin is a FIFO:

 /* needs #include <sys/stat.h> */
 struct stat st_info;
 if (fstat(0, &st_info) != 0) {
   //error
 }
 if (S_ISFIFO(st_info.st_mode)) {
   //stdin is a pipe
 }

Though that won't handle the user doing myprogram <file
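One way to cover the redirected-file case as well is to test for a regular file with S_ISREG, in which case fstat already reports a size. The sketch below assumes a POSIX system, and the helper name `known_size` is made up for the example.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: returns the size in bytes if `fd` refers to a
 * regular file, or -1 if the size cannot be known up front (pipe,
 * terminal, socket, ...) or if fstat fails. */
long long known_size(int fd) {
    struct stat st;
    if (fstat(fd, &st) != 0) {
        return -1;
    }
    if (S_ISREG(st.st_mode)) {
        /* total size of the file, regardless of the current read offset */
        return (long long)st.st_size;
    }
    return -1;
}
```

With this, `known_size(0)` would return the file size for myprogram <file and -1 for cat file | myprogram, where counting bytes until EOF remains the only option.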

You can also check whether stdin is a terminal/console:

if(isatty(0)) {
  //stdin is a terminal
}
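One possible use of this check, sketched under the assumption that the program should refuse to sit waiting on an interactive terminal: if no filename was given and stdin is a terminal, print a usage message instead of reading. The helper name `waiting_on_terminal` is hypothetical, and whether this behaviour is desirable is a design choice (cat, for instance, simply reads from the terminal).

```c
#include <unistd.h>

/* Hypothetical helper: returns 1 when the program was started without a
 * filename (argc < 2) while `fd` is an interactive terminal, meaning
 * nothing was piped or redirected in. Returns 0 otherwise. */
int waiting_on_terminal(int argc, int fd) {
    return argc < 2 && isatty(fd);
}
```

A main function could call `waiting_on_terminal(argc, STDIN_FILENO)` before its first read and exit with a usage message when it returns 1.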

Note that what you want to know is whether stdin is connected to a terminal or not, not whether it exists. It always exists, but when you use the shell to pipe something into it or to redirect a file into it, it is not connected to a terminal.

You can check whether a file descriptor is connected to a terminal via the termios.h functions:

#include <termios.h>
#include <unistd.h>   /* STDIN_FILENO */
#include <stdbool.h>

bool stdin_is_a_pipe(void)
{
    struct termios t;
    return (tcgetattr(STDIN_FILENO, &t) < 0);
}

This will try to fetch the terminal attributes of stdin. If stdin is attached to a tty, the tcgetattr call will succeed; if it is connected to a pipe, the call will fail. So, in order to detect a pipe, we check for tcgetattr failure. (The call also fails when stdin is redirected from a regular file, so strictly speaking a failure means "not a terminal".)

I think you can just test for end-of-file with feof.
