
Is it necessary to write a "portable" if (c == '\n') to process cross-platform files?

This question comes from a discussion about a practical problem, "Replacing multiple new lines in a file with just one". Something went wrong while using a Cygwin terminal running on a Windows 8.1 machine.

Since the end-of-line terminator can differ (\n, \r, or \r\n), is it necessary to write a "portable" if (c == '\n') to make it work well on Linux, Windows and OS X? Or is the best practice just to convert the file with commands/tools?

    #include <stdio.h>

    int main(void)
    {
        FILE *pFile;
        int c;
        int n = 0;

        pFile = fopen("myfile.txt", "r");
        if (pFile == NULL) {
            perror("Error opening file");
        } else {
            do {
                c = fgetc(pFile);
                if (c == '\n') n++; // will it work fine on different platforms?
            } while (c != EOF);
            fclose(pFile);
            printf("The file contains %d lines.\n", n);
        }
        return 0;
    }

Update1:

Will the CRT always convert line endings into '\n'?

If an input file is opened in binary mode (the character 'b' in the mode string), then it is necessary to worry about the possible presence of '\r' before '\n'.

If the file is not opened in binary mode (and also not read using binary functions such as fread()), then it is not necessary to worry about the presence of '\r' before '\n', because that will be handled before the input is received by your code: either by a relevant system function (e.g. a device driver that reads input from disk, or from stdin) or by the implementation of the functions you use to read input from the file.
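The difference between the two modes can be observed directly. The following is a minimal sketch, not part of the original answer: count_cr is a hypothetical helper that counts how many '\r' characters the program actually receives when a file is opened with a given mode string.

```c
#include <stdio.h>

/* Hypothetical helper: counts the '\r' characters that the program actually
   sees when `path` is opened with the given fopen mode ("r" or "rb"). */
static long count_cr(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    long n = 0;
    int c;

    if (fp == NULL)
        return -1;                /* could not open the file */
    while ((c = fgetc(fp)) != EOF)
        if (c == '\r')
            n++;
    fclose(fp);
    return n;
}
```

For a file whose bytes are "a\r\nb\r\n", count_cr(path, "rb") reports 2 on any platform. With mode "r", a Windows CRT is expected to report 0, because text mode translates \r\n to \n on input, while on Linux and OS X text and binary mode behave the same and the result is still 2. That second case is exactly the situation where a file written on Windows and read on Linux still shows '\r' to the program.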

If you are transferring files between systems (e.g. writing the file under Linux and transferring it to a Windows system, where a program tries to read it in), then you have options:

  • Write and read the file in non-binary mode, and do a relevant translation of the file when transferring it between systems. If using ftp, this can be handled by transferring the file using text mode rather than binary mode. If the file is transferred in binary mode, then you will need to run the file through dos2unix (if transferring the file to Unix) or through unix2dos (going the other way).
  • Do all your I/O in binary mode, transfer the files between systems using binary mode, and never read them in non-binary mode. Among other things, this gives you explicit control over what data is in the file.
  • Write your file in text mode and transfer the file as you see fit. Then only read in binary mode and, when your reading code encounters a \r\n pair, drop the '\r' character.

The last is arguably the most robust: the writing code might include \r before \n characters, or it might not, but the reading code simply ignores any '\r' character that it encounters before a '\n' character. Such code will probably even cope if the files are edited by hand before being read (e.g. with a text editor, which might be separately configured to either insert or remove \r and \n).
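A minimal sketch of that last option, applied to the line counter from the question, might look like the following. The names fgetc_drop_cr and count_lines are hypothetical and introduced only for illustration. The file is opened in binary mode, and a '\r' is dropped only when it is immediately followed by '\n'; a lone '\r' is passed through unchanged and is not treated as a line terminator here.

```c
#include <stdio.h>

/* Hypothetical helper: returns the next character from a stream opened in
   binary mode, dropping a '\r' only when it is immediately followed by '\n'. */
static int fgetc_drop_cr(FILE *fp)
{
    int c = fgetc(fp);

    if (c == '\r') {
        int next = fgetc(fp);
        if (next == '\n')
            return '\n';          /* "\r\n" collapses to "\n" */
        if (next != EOF)
            ungetc(next, fp);     /* lone '\r': push the lookahead back */
    }
    return c;
}

/* Hypothetical helper: counts '\n'-terminated lines, whether the file uses
   "\n" or "\r\n" endings. Returns -1 if the file cannot be opened. */
static long count_lines(const char *path)
{
    FILE *fp = fopen(path, "rb"); /* binary mode: no translation by the CRT */
    long n = 0;
    int c;

    if (fp == NULL)
        return -1;
    while ((c = fgetc_drop_cr(fp)) != EOF)
        if (c == '\n')
            n++;
    fclose(fp);
    return n;
}
```

For pure line counting, comparing against '\n' alone would already give the same number, since every \r\n pair contains one '\n'. The filtering matters once the characters are copied or compared, for example when collapsing runs of empty lines, where an unnoticed '\r' between two '\n' characters makes an empty line look non-empty.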
