
C++: how to use fstream to read a tab-delimited file with spaces

I need to use some C++ code to read a tab-delimited text file. The file contains three columns, and the second column contains strings with spaces. Below are some example lines from the file.

1   hellow world    uid_1
2   good morning    uid_2

The following is the C++ code that I need to use to read the file. However, it can't read the file properly when it hits the space in the string.

Any suggestions on modifying the while loop to make it work? I'm not familiar with C++. Please provide detailed code. Thanks!

#include <Rcpp.h>
#include <iostream>
#include <fstream>
#include <string>

std::ifstream infile (file_name.c_str());

int row = -1; 
std::string col;
std::string uid;


while (infile >> row >> col >> uid) {

    // operations on row, col and uid

}

It's hard to do this directly. This is because you need to use a combination of formatted (operator>>) and unformatted (std::getline) input routines.
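To illustrate why formatted input alone is not enough, here is a small sketch that runs the question's loop over the two sample lines. The struct NaiveResult and the function read_with_formatted_input_only are hypothetical names made up for this illustration, and a std::istringstream stands in for the file.

```cpp
#include <sstream>
#include <string>

// Hypothetical record of what the formatted-only loop managed to read.
struct NaiveResult {
    int         rows_read = 0;
    std::string first_col;
    std::string first_uid;
};

// Runs the loop from the question (operator>> only) over the given text.
NaiveResult read_with_formatted_input_only(const std::string& text)
{
    std::istringstream in(text);   // stands in for the std::ifstream
    NaiveResult result;
    int row = -1;
    std::string col;
    std::string uid;

    // operator>> splits on ANY white space, so a space inside the
    // second column ends that field just like a tab would.
    while (in >> row >> col >> uid) {
        if (result.rows_read == 0) {
            result.first_col = col;
            result.first_uid = uid;
        }
        ++result.rows_read;
    }
    return result;
}
```

With the sample data, the first pass stores "hellow" in col and "world" in uid. The next pass then tries to parse "uid_1" as an integer, which fails and ends the loop after a single row.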

You want to use operator>> to read the id field (and correctly parse an integer); but then you also want to use the function std::getline(), passing '\t' as the third parameter, to read a tab-delimited field (note: the field terminator defaults to '\n', i.e. line-delimited values).

Normally you don't want to mix operator>> and std::getline() because of how they handle white space: operator>> skips leading white space but leaves the delimiter that follows the value in the stream, whereas std::getline() skips nothing.
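The white-space issue can be seen in a minimal sketch like the one below. The function names naive_read and careful_read are made up for this illustration, and the input is a string rather than a file.

```cpp
#include <sstream>
#include <string>

// Reads an int with operator>> and then calls std::getline immediately.
// Returns what std::getline produced.
std::string naive_read(const std::string& text)
{
    std::istringstream in(text);
    int n = 0;
    std::string rest;
    in >> n;                 // stops right after the digits; '\n' stays in the stream
    std::getline(in, rest);  // consumes only that leftover '\n', so rest is empty
    return rest;
}

// Same as above, but discards white space with std::ws before std::getline.
std::string careful_read(const std::string& text)
{
    std::istringstream in(text);
    int n = 0;
    std::string rest;
    in >> n >> std::ws;      // std::ws eats the '\n' (and any other leading white space)
    std::getline(in, rest);  // now reads the actual next line
    return rest;
}
```

For the input "42\nhello world\n", naive_read returns an empty string while careful_read returns "hello world". This is the extra white space that the custom input operator has to deal with explicitly.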

So the best solution is to write your own input operator and handle that extra space explicitly in a controlled manner.

How to do it:

I would create a class to represent the line.

struct Line
{
    int          id;
    std::string  col;
    std::string  uid;

    void swap(Line& other) noexcept {
        using std::swap;
        swap(id, other.id);
        swap(col, other.col);
        swap(uid, other.uid);
    }
    friend std::istream& operator>>(std::istream& in, Line& data);
};

Then you need to define an input operator for reading the line.

std::istream& operator>>(std::istream& in, Line& data)
{
    Line   tmp;
    if (// 1 Read the id. Then discard leading white space before the second field.
        (in >> tmp.id >> std::ws) &&
        // 2 Read the second field (which is terminated by tab)
        (std::getline(in, tmp.col, '\t')) &&
        // 3 Read the third field  (which is terminated by newline)
        (std::getline(in, tmp.uid))
        // I am being lazy on 3; you may want to be more specific.
       )
    {
        // We have correctly read all the data we need from
        // the line so set the data object from the tmp value.
        data.swap(tmp);
    }
    return in;
}

Now it can be used easily.

Line line;
while (infile >> line) {

    // operations on line.id, line.col and line.uid

}
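Putting the pieces together, a self-contained sketch might look like the following. It is a simplified variant, not the exact code above: the swap member is replaced by a plain move assignment, and read_lines is a hypothetical helper added only for this illustration. It takes any std::istream, so it works with a std::ifstream as well as with the std::istringstream used for testing.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct Line
{
    int         id = 0;
    std::string col;
    std::string uid;
};

// Reads one "id <TAB> text with spaces <TAB> uid" record.
std::istream& operator>>(std::istream& in, Line& data)
{
    Line tmp;
    if ((in >> tmp.id >> std::ws) &&          // id, then skip the tab after it
        std::getline(in, tmp.col, '\t') &&    // second field, up to the next tab
        std::getline(in, tmp.uid))            // third field, up to end of line
    {
        data = std::move(tmp);                // commit only if the whole record parsed
    }
    return in;
}

// Hypothetical helper: collects every record that parses successfully.
std::vector<Line> read_lines(std::istream& in)
{
    std::vector<Line> rows;
    Line line;
    while (in >> line) {
        rows.push_back(line);
    }
    return rows;
}
```

To use it on a file, open a std::ifstream and pass it to read_lines. Two caveats apply to this sketch: std::ws also skips tabs, so a row whose second field is empty would be misread, and a file with Windows line endings would leave a trailing '\r' in uid.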

One way would be as follows:

#include <iostream>
#include <vector>
#include <fstream>
#include <iterator>
#include <sstream>
#include <string>

using namespace std;

// take from http://stackoverflow.com/a/236803/248823
void split(const std::string &s, char delim, std::vector<std::string> &elems) {
    std::stringstream ss;
    ss.str(s);
    std::string item;
    while (std::getline(ss, item, delim)) {
        elems.push_back(item);
    }
}

int main() {
    std::ifstream infile ("./data.asc");

    std::string line;



    while (std::getline(infile, line))
    {
        vector<string> row_values;

        split(line, '\t', row_values);

        for (auto v: row_values)
            cout << v << ',' ;

        cout << endl;
     }

    cout << "hello " << endl;
    return 0;
}

Results in:

1,hellow world,uid_1,
2,good morning,uid_2,

Note the trailing comma. I'm not sure what you want to do with the values from the file, so I just made it as simple as possible.
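If the trailing comma is unwanted, one possible approach is to write the separator before every field except the first. The sketch below assumes a variant of split that returns the vector instead of filling an output parameter, and join is a hypothetical helper introduced only for this example.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Variant of split that returns the fields instead of filling an out-parameter.
std::vector<std::string> split(const std::string& s, char delim)
{
    std::vector<std::string> elems;
    std::stringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim)) {
        elems.push_back(item);
    }
    return elems;
}

// Hypothetical helper: joins fields with sep, without a trailing separator.
std::string join(const std::vector<std::string>& fields, char sep)
{
    std::string out;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i != 0) {
            out += sep;   // separator goes between fields only
        }
        out += fields[i];
    }
    return out;
}
```

For example, join(split(line, '\t'), ',') turns the first sample line into "1,hellow world,uid_1".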
