
std::vector<string> odd behavior

I have a weird issue I cannot figure out. When I run the code below, which reads a text file line by line into a vector<string> and then compares each element to the string "--", the comparison never succeeds.

Furthermore, in convert_file(), the string m inside the for loop behaves strangely: string m = "1"; m += "--"; (the "--" comes from the vector) m += "2"; prints 2-- to the console. The 2 replaces the 1, the first character, which makes me think something is corrupting the vector.

#include <iostream>
#include <sstream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;

vector<string> get_file(const char* file){
      int SIZE=256, ln=0;
      char str[SIZE];
      vector<string> strs;
      ifstream in(file, ios::in);
      if(!in){
        return strs;
      } else {
        while(in.getline(str,SIZE)){
          strs.push_back(string(str));
          ln++;
        }
      }
      in.close();
      return strs;
    }

void convert_file(const char* file){
      vector<string> s = get_file(file);

      vector<string> d;
      int a, b;
      bool t = false;
      string comp = "--";

      for(int i=0; i<s.size(); i++){
        string m = "1";
        m+= string(s.at(i));
        m+= "2";
        cout << m << endl;
        if(s.at(i) == comp){
          cout << "s[i] == '--'" << endl;
        }
      }
    }

int main(){
  convert_file("test.txt");
  return 0;
}

Now, when I run a test file to check a similar program:

#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main(){
  vector<string> s;
  s.push_back("--");
  s.push_back("a");

  for(int i=0; i<s.size(); i++){
    cout << "1" << s.at(i) << "2" << endl;
    if(s.at(i) == "--"){
      cout << i << "= --" << endl;
    }
  }
  return 0;
}

This prints 1--2, 0= --, 1a2. It prints properly and performs the comparison. This leads me to think something goes wrong when I read the line into a string.

Windows 7, cygwin64
g++ version 4.9.3
compile: D:\projects\test>g++ -o a -std=c++11 test.cpp

Based on the behavior and the discussion, the lines in the file are terminated with a "\r\n" sequence. The easiest way to deal with the remaining '\r' is to remove it after reading a line. For example:

for (std::string line; std::getline(file, line); ) {
    if (!line.empty() && line.back() == '\r') {
        line.resize(line.size() - 1u);
    }
    strs.push_back(line);
}

If you insist on reading into char arrays, you can use file.gcount() to determine the number of characters read and so find the end of the string quickly. Note, however, that the count includes the newline character, i.e., you'd want to check str[file.gcount() - 2] and potentially set it to '\0' (if the count is greater than or equal to 2, of course).
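A minimal sketch of the char-array variant might look like the following. The function name read_lines_crlf is hypothetical, and the eof() check is an added assumption not stated above: when the last line has no trailing newline, gcount() does not include a delimiter, so the index of the last character shifts by one.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read lines into a fixed-size char buffer and
// drop a trailing '\r' left over from "\r\n" line endings.
std::vector<std::string> read_lines_crlf(std::istream& in) {
    const std::size_t SIZE = 256;
    char str[SIZE];
    std::vector<std::string> strs;
    while (in.getline(str, SIZE)) {
        // gcount() counts every extracted character, including the '\n'
        // delimiter when one was found (the delimiter is not stored).
        std::streamsize len = in.gcount();
        if (!in.eof()) {
            --len;  // exclude the extracted '\n'
        }
        // The last stored character is str[len - 1]; when a delimiter was
        // extracted this is the same position as str[gcount() - 2].
        if (len > 0 && str[len - 1] == '\r') {
            str[len - 1] = '\0';
            --len;
        }
        strs.push_back(std::string(str, static_cast<std::size_t>(len)));
    }
    return strs;
}
```

With this helper, a line read from a "\r\n"-terminated file compares equal to "--" again, because the stray '\r' is no longer part of the stored string.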

As Dietmar Kühl already answered, the problem is the \r\n line endings.
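To see why a leftover '\r' produces both symptoms, here is a small sketch that reproduces them without any file. The helper names show_escapes and first_line are made up for illustration. It assumes the stream hands the "\r\n" bytes through untranslated, which is what happens in binary mode.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: make control characters visible so that a stray
// '\r' shows up in the output instead of moving the cursor.
std::string show_escapes(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (c == '\r') {
            out += "\\r";
        } else if (c == '\n') {
            out += "\\n";
        } else {
            out += c;
        }
    }
    return out;
}

// Hypothetical helper: return the first line of text as std::getline sees it.
std::string first_line(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    std::getline(in, line);  // stops at '\n' but keeps any preceding '\r'
    return line;
}
```

For the input "--\r\n", first_line returns the three characters "--\r", so comparing it with "--" fails. Printing "1" + line + "2" sends 1--\r2 to the terminal; a terminal typically treats '\r' as "return to the start of the line", so the 2 overwrites the 1 and the screen shows 2--.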

However, you should not need to modify your source code. The default behaviour in C++ is supposed to be to open files in text mode. Text mode means that whenever a line ending is found, where "line ending" depends on the platform you're using, it gets translated so that your program just sees a single \n. You're supposed to explicitly request "binary mode" from your program to disable this line ending translation. This has been long-standing practice on Windows systems, is behaviour well supported by the C++ standard, and is the expected behaviour with native Windows compilers. But for compatibility with POSIX and with existing Unix programs that do not bother setting the file mode properly, Cygwin ignores this and defaults to opening files in binary mode unless a custom Cygwin-specific text mode is explicitly requested.

This is covered in the Cygwin FAQ. The first solutions provided there (using O_TEXT or "t", depending on how you open your file) are non-standard, so they break your code in other environments, and they are not as easy to use with C++ <fstream> file access.

However, the next solutions provided there do work even for C++ programs:

You can also avoid to change the source code at all by linking an additional object file to your executable. Cygwin provides various object files in the /usr/lib directory which, when linked to an executable, changes the default open modes of any file opened within the executed process itself. The files are

binmode.o      - Open all files in binary mode.
textmode.o     - Open all files in text mode.
textreadmode.o - Open all files opened for reading in text mode.
automode.o     - Open all files opened for reading in text mode,
                 all files opened for writing in binary mode.

And indeed, after changing your compiler and linker invocation from g++ -o a -std=c++11 test.cpp to g++ -o a -std=c++11 test.cpp /usr/lib/textmode.o, your program works without changes to your source code. Linking with textmode.o basically means that your I/O will work the way it already should work by default.
