
How to read a file with multiple delimiters within a single line

I am trying to read a file that has multiple delimiters per line. The data is below:

2,22:11
3,33:11
4,44:11
5,55:11
6,66:11
7,77:11
8,88:11
9,99:11
10,00:11
11,011:11
12,022:11
13,033:11
14,044:11
15,055:11
16,066:11
17,077:11
18,088:11
19,099:11

And the code is below

Here, I am trying to read each line first up to the comma delimiter and then up to the colon.

#include <fstream>
#include <iostream>
#include <string>

int main() {
  std::string line;
  std::string token;
  std::ifstream infile;

  infile.open("./data.txt");
  while (std::getline(infile, line,',')) {
    std::cout << line << std::endl;
    while(getline(infile, token, ':'))
    {
      std::cout << " : " << token;
    }
  }
}

But there is an issue with the code: it seems to skip the first line. Also, if I comment out the second while loop, only the first line is printed. The output is below.

I am unable to figure out where exactly the code has gone wrong.

Output

2
 : 22 : 11
3,33 : 11
4,44 : 11
5,55 : 11
6,66 : 11
7,77 : 11
8,88 : 11
9,99 : 11
10,00 : 11
11,011 : 11
12,022 : 11
13,033 : 11
14,044 : 11
15,055 : 11
16,066 : 11
17,077 : 11
18,088 : 11
19,099 : 11

Why two while loops?

Your problem is that the second while loop keeps running until the end of the file. The first while runs only once, to read the first 2; the second while then consumes everything else in the file.

You can do it all with a single while, something like:

#include <fstream>
#include <iostream>
#include <string>

int main() {
  std::string line;
  std::string token;
  std::string num;
  std::ifstream infile;
  infile.open("a.txt");
  // Each iteration reads one record: up to ',', then up to ':', then the rest of the line.
  while (   std::getline(infile, line, ',')
         && std::getline(infile, token, ':')
         && std::getline(infile, num) )
    std::cout << line << ',' << token << ':' << num << std::endl;
}
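
The same chained std::getline idea can be tried without a file by reading from a std::istringstream. What follows is only a minimal sketch: the Record struct and the read_records function are hypothetical names that do not appear in the answer above, and the code assumes every line has exactly the form a,b:c.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type: the three fields of a line such as "2,22:11".
struct Record {
  std::string first;   // text before ','
  std::string second;  // text between ',' and ':'
  std::string third;   // text after ':' up to the end of the line
};

// Reads records from any input stream using three chained getline calls.
// Assumes each line contains one ',' followed by one ':'.
std::vector<Record> read_records(std::istream& in) {
  std::vector<Record> records;
  Record r;
  while (std::getline(in, r.first, ',') &&
         std::getline(in, r.second, ':') &&
         std::getline(in, r.third)) {
    records.push_back(r);
  }
  return records;
}
```

Because the function takes a std::istream&, it should work the same way with a std::ifstream opened on the data file or with a std::istringstream holding test data.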

The problem comes from the fact that you call std::getline in two nested loops.

When you enter the first loop, the first call to std::getline returns what you expect: the first line up to the , delimiter.

Then you reach the second std::getline, in a nested loop, to get the rest of the line. But you never leave that second loop until the end of the file, so you read the whole remaining file split on the : delimiter.

When the second std::getline reaches the end of the file, the nested loop exits.

Because the whole file has already been read, nothing is left to read and the first loop exits immediately.

Here is some debug output to help you understand what happens:

#include <fstream>
#include <iostream>
#include <string>

int main() {
  std::string line;
  std::string token;
  std::ifstream infile;

  infile.open("./data.txt");
  while (std::getline(infile, line, ',')) {
    std::cout << "First loop : " << line << std::endl;
    while(getline(infile, token, ':'))
    {
      std::cout << "Inner loop : " << token << std::endl;
    }
  }
}

The first lines printed are:

First loop : 2
Inner loop : 22
Inner loop : 11
3,33
Inner loop : 11
4,44

You can clearly see it doesn't exit the second loop until the end.
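
To check this without reading the printed output, the loops can be run on in-memory text and their iterations counted. This is a small sketch under assumptions: count_loop_iterations is a hypothetical helper, and the input is passed as a string instead of being read from data.txt.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Runs the nested loops from the question on in-memory text and returns
// {outer iterations, inner iterations}.
std::pair<int, int> count_loop_iterations(const std::string& text) {
  std::istringstream in(text);
  std::string line;
  std::string token;
  int outer = 0;
  int inner = 0;
  while (std::getline(in, line, ',')) {
    ++outer;
    while (std::getline(in, token, ':')) {
      ++inner;  // keeps running until the stream is exhausted
    }
  }
  return {outer, inner};
}
```

For the two-line input "2,22:11\n3,33:11\n" the outer loop body runs once and the inner one three times, because the inner loop only stops when std::getline fails at the end of the input.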

I would advise reading the entire line, without regard to delimiters, and then splitting the string into tokens with a tailor-made function. It is easy and very clean.

Solution:

#include <fstream>
#include <list>
#include <iostream>
#include <string>

struct line_content {
  std::string line_number;
  std::string token;
  std::string value;
};

struct line_content tokenize_line(const std::string& line)
{
  line_content l;

  auto comma_pos = line.find(',');
  l.line_number = line.substr(0, comma_pos);

  auto point_pos = line.find(':');
  // Length covers only the characters strictly between ',' and ':'.
  l.token = line.substr(comma_pos + 1, point_pos - comma_pos - 1);

  l.value = line.substr(point_pos + 1);

  return l;
}

void print_tokens(const std::list<line_content>& tokens)
{
  for (const auto& line: tokens) {
    std::cout << "Line number : " << line.line_number << std::endl;
    std::cout << "Token : " << line.token << std::endl;
    std::cout << "Value : " << line.value << std::endl;
  }
}

int main() {
  std::string line;
  std::ifstream infile;

  std::list<line_content> tokens;

  infile.open("./data.txt");
  while (std::getline(infile, line)) {
    tokens.push_back(tokenize_line(line));
  }

  print_tokens(tokens);

  return 0;
}

I think you should be able to do what you want with this.

Compile as follows: g++ -Wall -Wextra --std=c++1y <your c++ file>
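
A variation on the same idea is to keep std::getline for the splitting but apply it to a std::istringstream built from a single line, so that a delimiter search can never run past the end of that line. This is a sketch and not part of the answer above: split_line is a hypothetical helper, and it assumes the a,b:c layout shown in the question.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one line of the form "a,b:c" into its fields.
// Returns an empty vector if the line does not match that layout
// (a delimiter is missing or the last field is empty).
std::vector<std::string> split_line(const std::string& line) {
  std::istringstream fields(line);
  std::string a;
  std::string b;
  std::string c;
  if (std::getline(fields, a, ',') &&
      std::getline(fields, b, ':') &&
      std::getline(fields, c)) {
    return {a, b, c};
  }
  return {};
}
```

It would be called once per line inside a while (std::getline(infile, line)) loop, in the same place where tokenize_line is called.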

If you want to split a string on multiple delimiters, without having to worry about the order of the delimiters, you can use std::string::find_first_of():

#include <fstream>
#include <iostream>
#include <iterator>
#include <string>

int
main()
{
        std::ifstream f("./data.txt");
        std::string fstring = std::string(std::istreambuf_iterator<char>(f),
                                          std::istreambuf_iterator<char>());
        std::size_t next, pos = 0;
        while((next = fstring.find_first_of(",:\n", pos)) != std::string::npos)
        {
                std::cout.write(&fstring[pos], ++next - pos);
                pos = next;
        }

        std::cout << &fstring[pos] << '\n';

        return 0;
}
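
If the pieces are needed as separate strings instead of being written straight to std::cout, the same find_first_of() loop can collect them into a vector. The following is a sketch under assumptions: split_any is a hypothetical helper name, and skipping empty pieces is a design choice made here, not something the answer above specifies.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits `text` at every character found in `delims` and returns the pieces.
// Empty pieces (for example between two adjacent delimiters) are skipped.
std::vector<std::string> split_any(const std::string& text,
                                   const std::string& delims) {
  std::vector<std::string> tokens;
  std::size_t pos = 0;
  while (pos < text.size()) {
    std::size_t next = text.find_first_of(delims, pos);
    if (next == std::string::npos) {
      next = text.size();  // last token runs to the end of the text
    }
    if (next > pos) {
      tokens.push_back(text.substr(pos, next - pos));
    }
    pos = next + 1;  // step over the delimiter
  }
  return tokens;
}
```

With the delimiter set ",:\n" this treats the whole file content as one flat sequence of tokens, so the caller has to regroup them in threes if the per-line structure matters.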
