
How to read a file with multiple delimiters within a single line

I am trying to read a file that has multiple delimiters per line. The data is below:

2,22:11
3,33:11
4,44:11
5,55:11
6,66:11
7,77:11
8,88:11
9,99:11
10,00:11
11,011:11
12,022:11
13,033:11
14,044:11
15,055:11
16,066:11
17,077:11
18,088:11
19,099:11

The code is below. Here, I am trying to read each line first with the comma as the delimiter, and then with the colon.

#include <fstream>
#include <iostream>
#include <string>

int main() {
  std::string line;
  std::string token;
  std::ifstream infile;

  infile.open("./data.txt");
  while (std::getline(infile, line,',')) {
    std::cout << line << std::endl;
    while(getline(infile, token, ':'))
    {
      std::cout << " : " << token;
    }
  }
}

But there is an issue with the code: it skips the first line. Also, if I comment out the second while loop, only the first line gets printed. The output is below.

I am unable to figure out where exactly the code has gone wrong.

Output:

2
 : 22 : 11
3,33 : 11
4,44 : 11
5,55 : 11
6,66 : 11
7,77 : 11
8,88 : 11
9,99 : 11
10,00 : 11
11,011 : 11
12,022 : 11
13,033 : 11
14,044 : 11
15,055 : 11
16,066 : 11
17,077 : 11
18,088 : 11
19,099 : 11

Why two while loops?

Your problem is that the second while loop keeps running until the end of the file. The first while executes only once, to read the leading 2; the second while then consumes everything else in the file.

You can do it all with a single while; something like:

#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main() {
  string line;
  string token;
  string num;
  ifstream infile;
  infile.open("a.txt");
  // Each iteration reads one full record: up to ',', then up to ':',
  // then the rest of the line.
  while (   getline(infile, line, ',')
         && getline(infile, token, ':')
         && getline(infile, num) )
    cout << line << ',' << token << ':' << num << endl;
}

The problem comes from the fact that you call std::getline in two nested loops.

At the beginning you enter the first loop. The first call to std::getline returns what you expect: the start of the first line, up to the , delimiter.

Then you reach the second std::getline, in a nested loop, to get the rest of the line. But the thing is, you never leave the second loop until the end of the file. So you read the whole remaining file, split on the : delimiter.

When the second std::getline reaches the end of the file, it leaves the nested loop.

Because you have already read the whole file, nothing is left to read and the first loop exits immediately.

Here is some debug output to help you understand what happens:

#include <fstream>
#include <iostream>
#include <string>

int main() {
  std::string line;
  std::string token;
  std::ifstream infile;

  infile.open("./data.txt");
  while (std::getline(infile, line, ',')) {
    std::cout << "First loop : " << line << std::endl;
    while(getline(infile, token, ':'))
    {
      std::cout << "Inner loop : " << token << std::endl;
    }
  }
}

The first lines printed are:

First loop : 2
Inner loop : 22
Inner loop : 11
3,33
Inner loop : 11
4,44

You can clearly see that it does not exit the second loop until the end of the file.

I would advise reading the entire line, ignoring the delimiters, and then splitting the string into tokens with a tailor-made function. It is easy and very clean.

Solution:

#include <fstream>
#include <list>
#include <iostream>
#include <string>

struct line_content {
  std::string line_number;
  std::string token;
  std::string value;
};

struct line_content tokenize_line(const std::string& line)
{
  line_content l;

  auto comma_pos = line.find(',');
  l.line_number = line.substr(0, comma_pos);

  auto point_pos = line.find(':');
  // The length excludes the ':' itself, hence the extra - 1.
  l.token = line.substr(comma_pos + 1, point_pos - comma_pos - 1);

  l.value = line.substr(point_pos + 1);

  return l;
}

void print_tokens(const std::list<line_content>& tokens)
{
  for (const auto& line: tokens) {
    std::cout << "Line number : " << line.line_number << std::endl;
    std::cout << "Token : " << line.token << std::endl;
    std::cout << "Value : " << line.value << std::endl;
  }
}

int main() {
  std::string line;
  std::ifstream infile;

  std::list<line_content> tokens;

  infile.open("./data.txt");
  while (std::getline(infile, line)) {
    tokens.push_back(tokenize_line(line));
  }

  print_tokens(tokens);

  return 0;
}

With this, I think you should be able to do what you want.

Compiled as follows: g++ -Wall -Wextra --std=c++1y <your c++ file>

If you want to split a string on multiple delimiters without having to worry about the order of the delimiters, you can use std::string::find_first_of():

#include <fstream>
#include <iostream>
#include <iterator>
#include <streambuf>
#include <string>

int
main()
{
        std::ifstream f("./data.txt");
        std::string fstring = std::string(std::istreambuf_iterator<char>(f),
                                          std::istreambuf_iterator<char>());
        std::size_t next, pos = 0;
        while((next = fstring.find_first_of(",:\n", pos)) != std::string::npos)
        {
                std::cout.write(&fstring[pos], ++next - pos);
                pos = next;
        }

        std::cout << &fstring[pos] << '\n';

        return 0;
}
