
Treat a space (or tab) separated file "like" an array - C++

I have a short tab/space separated file (I can generate it with either separator) with the following structure:

[data00] <space> [data01] <space> [data02] <space> [data03] <newline>
[data10] <space> [data11] <space> [data12] <space> [data13] <newline>
...

The first column represents a numerical ID. I create this file to feed it to another executable, so the format is fixed. After being fed the file, the executable outputs another file with a similar structure:

[data00] <space> [data01]<newline>
[data10] <space> [data11]<newline>
...

Given an ID, I need to read the corresponding [dataX1] from the output file, perform operations on [dataX3] in the first file, feed it back to the executable, and iterate.

I can think of two ways of doing this:

  • Operate on the two text files "as if" they were arrays, given that their structure is fixed, but I am lost on what function/syntax to use. This should be a small function that lets me read the interesting bit by passing it the relevant numeric ID, hiding all the pesky I/O code, as I will probably need to repeat this operation a lot in different contexts.
  • Keep the first file in arrays and trick the executable by feeding it a stream (is this possible? The executable expects a file as an argument).

I could easily read the files into arrays and write the files anew each time, but I want to avoid useless read and write operations when all I need to read/write is one cell each time. What I don't know how to do is stop at, or identify, the bit of interest when I read a whole line from the text file using, say, getline.

First, we will write a function that splits an input string on a given separator. (In this case the separator is a space.)

#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Splits `line` on every occurrence of `separator` and appends each field to
// `values`. Returns the number of fields appended.
int split(const std::string& line, const std::string& separator, std::vector<std::string>* values){
    if(line.empty()) return 0;
    if(separator.empty()){
        // Nothing to split on: the whole line is a single field.
        values->push_back(line);
        return 1;
    }
    int counter = 0;
    std::string::size_type start = 0;
    std::string::size_type end;
    while((end = line.find(separator, start)) != std::string::npos){
        values->push_back(line.substr(start, end - start));
        ++counter;
        start = end + separator.size();
    }
    // The text after the last separator is the final field.
    values->push_back(line.substr(start));
    return counter + 1;
}

Now we will write a simple main that reads a file, uses split to break up each line, and then prints every field together with its position within the file.

// Placeholder path: set this to the file you want to read.
const char* const FILE_NAME = "input.txt";

int main(){
    std::vector<std::vector<std::string> > lines;
    std::string tString;
    std::vector<std::string> tVector;
    std::ifstream fileToLoad(FILE_NAME);

    if(!fileToLoad.is_open()){
        std::cerr<<"FAILED TO OPEN FILE: "<<FILE_NAME<<std::endl;
        return 1;
    }

    // Split every line into its fields.
    while(std::getline(fileToLoad,tString)){
        split(tString, " ", &tVector);
        lines.push_back(tVector);
        tVector.clear();
    }

    // Now print each field with its [row,column] position.
    for(unsigned i1 = 0; i1 < lines.size(); ++i1){
        for(unsigned i2 = 0; i2 < lines[i1].size(); ++i2){
            std::cout<<"["<<i1<<","<<i2<<"] = "<<lines[i1][i2]<<std::endl;
        }
    }
    return 0;
}

The input file I used has the data:

450 105 10 10.5 -10.56001 23
10 478 1290 384 1289 3489234 1 2 3 4 5
1 2 3 4 5 6.1 19 -1.5

And the output is:

[0,0] = 450
[0,1] = 105
[0,2] = 10
[0,3] = 10.5
[0,4] = -10.56001
[0,5] = 23
[1,0] = 10
[1,1] = 478
[1,2] = 1290
[1,3] = 384
[1,4] = 1289
[1,5] = 3489234
[1,6] = 1
[1,7] = 2
[1,8] = 3
[1,9] = 4
[1,10] = 5
[2,0] = 1
[2,1] = 2
[2,2] = 3
[2,3] = 4
[2,4] = 5
[2,5] = 6.1
[2,6] = 19
[2,7] = -1.5

Now all you need to do is use your preferred parsing function (strtod, atof, etc.) to convert each string into a double. Depending on how much optimization matters for your use case, you may also want to swap vector for a different container.
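For the lookup by ID that the question asks about, something along the following lines might work. This is only a sketch: the names find_row and to_double are made up for illustration and are not part of the answer above. It assumes the first column is an integer ID and that fields are separated by spaces or tabs. Reading stops at the first matching line, and lines with a different ID are skipped without splitting the rest of their fields.

```cpp
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: scans `in` line by line and fills `row` with the fields
// of the first line whose first column equals `id`. Returns false if no line matches.
bool find_row(std::istream& in, long id, std::vector<std::string>& row) {
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string field;
        if (!(fields >> field)) continue;  // blank line
        char* end = nullptr;
        const long row_id = std::strtol(field.c_str(), &end, 10);
        // Skip lines whose first field is not a number or is a different ID.
        if (end == field.c_str() || *end != '\0' || row_id != id) continue;
        row.assign(1, field);
        // operator>> splits on any whitespace, so spaces and tabs both work.
        while (fields >> field) row.push_back(field);
        return true;
    }
    return false;
}

// Hypothetical helper: converts `text` to a double with strtod and reports
// failure if the text is not entirely a number.
bool to_double(const std::string& text, double& out) {
    char* end = nullptr;
    const double value = std::strtod(text.c_str(), &end);
    if (end == text.c_str() || *end != '\0') return false;
    out = value;
    return true;
}
```

To use it on a file, open a std::ifstream on the file's path and pass it as `in`; after a successful call, row[1] holds [dataX1] for the requested ID and to_double can turn it into a number.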
