
Read a CSV file and add all of its data into a vector in C++

For example, to add the following CSV data:

[image: sample CSV data]

I am trying to read a CSV file into a 2D vector of strings and get the sum of each column. The following program doesn't work properly:

vector<string> read_csv(string filename){

    vector<string> result;
    fstream fin;
    fin.open(filename, ios::in);

    if(!fin.is_open())
        throw std::runtime_error("Could not open file");

    std::string line, colname;
    int val;

    // Read the column names
    if(fin.good())
    {
        std::getline(fin, line);
        std::stringstream ss(line);
        while(std::getline(ss, colname, ',')){
            result.push_back(colname);
            cout << colname << endl;
        }
    }

    while(std::getline(fin, line))
    {
        std::stringstream ss(line);
        int colIdx = 0;
        while(ss >> val){

            if(ss.peek() == ',') ss.ignore();
            colIdx++;
        }
    }
    fin.close();
    return result;
}

When I iterated over the vector, I didn't get the expected result. It showed only the column names.

for (int i = 0; i < vectorCsv.size(); ++i) 
{
        cout << vectorCsv[i] << endl;
}

I couldn't tell whether the error is in the read_csv() function or in the for loop. Thank you for looking at this problem.

In your while loop, you never push any values into your vector.

It looks like you have everything you need to read the CSV into a vector right here. The only problem is that you stopped at the column names.

// Read the column names
    if(fin.good())
    {
        std::getline(fin, line);
        std::stringstream ss(line);
        while(std::getline(ss, colname, ',')){
            result.push_back(colname);
            cout << colname << endl;
        }
    }

Try changing the code I copied above to:

// Read every line, header included
    while(std::getline(fin, line))
    {
        // The loop condition has already read the line; a second
        // std::getline here would skip every other row.
        std::stringstream ss(line);
        while(std::getline(ss, colname, ',')){
            result.push_back(colname);
            cout << colname << endl;
        }
    }
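
To go from this fix to the column sums the question asks for, the same std::getline approach can be split into small functions. The following is a minimal sketch, not the original poster's code: the names split_line, read_rows and sum_columns are hypothetical, row 0 is assumed to be the header, every data cell is assumed to be numeric, and quoted fields are not handled.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one CSV line on commas.
// Quoted fields and escaped commas are NOT handled.
std::vector<std::string> split_line(const std::string& line) {
    std::vector<std::string> cells;
    std::stringstream ss(line);
    std::string cell;
    while (std::getline(ss, cell, ',')) {
        cells.push_back(cell);
    }
    return cells;
}

// Reads every non-empty line of the stream, header included, into a 2D vector.
std::vector<std::vector<std::string>> read_rows(std::istream& in) {
    std::vector<std::vector<std::string>> rows;
    std::string line;
    while (std::getline(in, line)) {
        // Drop a trailing '\r' left over from Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) continue;
        rows.push_back(split_line(line));
    }
    return rows;
}

// Sums each column over the data rows (row 0 is assumed to be the header).
// Assumption: every data cell parses as a number; std::stod throws otherwise.
std::vector<double> sum_columns(const std::vector<std::vector<std::string>>& rows) {
    std::vector<double> sums;
    for (std::size_t r = 1; r < rows.size(); ++r) {
        if (rows[r].size() > sums.size()) sums.resize(rows[r].size(), 0.0);
        for (std::size_t c = 0; c < rows[r].size(); ++c) {
            sums[c] += std::stod(rows[r][c]);
        }
    }
    return sums;
}
```

Passing an opened std::ifstream to read_rows and the result to sum_columns gives one sum per column. If the first column holds labels rather than numbers, the inner loop would need to start at column 1.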

  1. Don't try to create vectors of std::strings; that's probably not very efficient, since each string is allocated and deallocated separately.
  2. Don't read CSVs yourself; you're reinventing the wheel. Use an existing library. Here's a question about finding one on Software Recommendations StackExchange:

    Modern C++ CSV reader (and maybe writer) library
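
If the per-string allocations from point 1 turn out to matter and the rows are purely numeric, one option is to parse each line straight into doubles. This is only a sketch under those assumptions: parse_numeric_row is a hypothetical helper, it stops at the first cell that is not a number, and std::strtod follows the current C locale for the decimal point.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: parses a comma-separated line of numbers directly
// into doubles, without creating a std::string per cell.
std::vector<double> parse_numeric_row(const std::string& line) {
    std::vector<double> values;
    const char* p = line.c_str();
    while (*p != '\0') {
        char* end = nullptr;
        const double v = std::strtod(p, &end);
        if (end == p) break;      // no number at this position: stop parsing
        values.push_back(v);
        p = end;
        if (*p != ',') break;     // end of line or unexpected character
        ++p;                      // skip the delimiter
    }
    return values;
}
```

A header row or a text column would still need separate handling, for example by reading the first line with std::getline before parsing the rest.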

I cannot believe that we would use a library for something as simple as splitting a std::string into tokens.

C++ has long had built-in, dedicated functionality designed specifically for tokenizing strings (splitting strings into tokens). Because such a simple, purpose-built facility is available, it should simply be used. There is no need for external libraries or complicated constructs. Simply use the std::sregex_token_iterator.

This is an iterator (like many other iterators) that iterates over the tokens (substrings) of a string. So, exactly what we want.

We can then use the std::vector's range constructor to write something simple like this:

std::vector tokens(std::sregex_token_iterator(line.begin(), line.end(), delimiter, -1), {});

So, we define a variable named "tokens" of type std::vector (with CTAD, the element type is deduced from the iterator; here it is std::ssub_match, which converts implicitly to std::string). We use the vector's range constructor and provide a begin and an end iterator. The begin iterator is the std::sregex_token_iterator, and the end iterator is its default-initialized counterpart.
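
As a small illustration of that range-constructor idiom, the sketch below wraps it in a function that returns std::vector<std::string> explicitly. The name tokenize is hypothetical and not part of the original answer.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` at every match of `delimiter`.
std::vector<std::string> tokenize(const std::string& line, const std::regex& delimiter) {
    // Submatch index -1 selects the text BETWEEN delimiter matches.
    // A default-constructed iterator marks the end of the sequence.
    return std::vector<std::string>(
        std::sregex_token_iterator(line.begin(), line.end(), delimiter, -1),
        std::sregex_token_iterator{});
}
```

Calling tokenize("1,2.5,abc", std::regex(",")) yields the three tokens "1", "2.5" and "abc". Note that this splits on every comma, so quoted fields containing commas are not treated specially.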

To put such a vector into a 2D vector, we use the outer vector's emplace_back function and construct the inner vector in place.

So you read the whole CSV file with 2 statements:

  • a simple for loop
  • a simple emplace_back with the std::sregex_token_iterator

        // We will read all lines of the source file with a simple for loop and std::getline
        for (std::string line{}; std::getline(csvFile, line); ) {

            // We will split the one big string into tokens (sub-strings) and add it to our 2D array
            csvData.emplace_back(std::vector<std::string>(std::sregex_token_iterator(line.begin(), line.end(), delimiter, -1), {}));
        }

So, why should you use a library for such a simple task that you can do with 2 statements? I personally fail to understand that. Therefore, I find that the advice in the accepted answer is flat wrong. But, to avoid starting religious discussions: this is my very personal humble opinion, and everybody can do what they want.

Please see a complete working example that solves your problem with just a few lines of code.

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <regex>

const std::string csvFileName{ "r:\\csv.csv" };
const std::regex delimiter{ "," };

int main() {

    // Open the file and check, if it could be opened
    if (std::ifstream csvFile(csvFileName); csvFile) {

        // This is our "2D array string vector" as described in your post
        std::vector<std::vector<std::string>> csvData{};


        // Read the complete CSV file into a 2D vector ----------------------------------------------------
        // We will read all lines of the source file with a simple for loop and std::getline
        for (std::string line{}; std::getline(csvFile, line); ) {

            // We will split the one big string into tokens (sub-strings) and add it to our 2D array
            csvData.emplace_back(std::vector<std::string>(std::sregex_token_iterator(line.begin(), line.end(), delimiter, -1), {}));
        }
        // -------------------------------------------------------------------------------------------------


        // This is for summing up values
        double DP{}, Dta{}, Dts{};

        // Iterate in a simple for loop over all data rows of the 2D vector (row 0 is the header), convert the values to double and sum them up
        for (size_t i = 1U; i < csvData.size(); ++i) {

            DP += std::stod(csvData[i].at(1));
            Dta += std::stod(csvData[i].at(2));
            Dts += std::stod(csvData[i].at(3));
        }

        // Show the result to the user
        std::cout << "\nSums:  DP: " << DP << "  Dta: " << Dta << "  Dts: " << Dts << "\n";
    }
    else { // In case that we could not open the source file
        std::cerr << "\n*** Error. Could not open file " << csvFileName << "\n\n";
    }
    return 0;
}

But as said, everybody can do whatever they want.

