
How to read CSV data to pointers of struct vector in C++?

I want to read CSV data into a vector of structs in C++. This is what I have written so far. I want to store the iris dataset in a pointer to a vector of structs named csv, declared as std::vector<Csv> *csv = new std::vector<Csv>;

#include <vector>
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>

struct Csv{
    float a;
    float b;
    float c;
    float d;
    std::string e;
};

int main(){
    std::string colname;
    
    // Iris csv dataset downloaded from
    // https://gist.github.com/curran/a08a1080b88344b0c8a7
    std::ifstream *myFile = new std::ifstream("iris.csv");
    

    std::vector<Csv> *csv = new std::vector<Csv>;
    
    std::string line;
    
    // Read the column names
    if(myFile->good())
    {
        // Extract the first line in the file
        std::getline(*myFile, line);

        // Create a stringstream from line
        std::stringstream ss(line);

        // Extract each column name
        while(std::getline(ss, colname, ',')){
            
            std::cout<<colname<<std::endl;
            }
    }
    

   // Read data, line by line
    while(std::getline(*myFile, line))
    {
        // Create a stringstream of the current line
        std::stringstream ss(line);

        
    }
        
    return 0;
}

I don't know how to implement this part of the code, which has to handle lines that contain both float values and a string.

   // Read data, line by line
    while(std::getline(*myFile, line))
    {
        // Create a stringstream of the current line
        std::stringstream ss(line);

        
    }

Evolution

We start with your program and complete it in your current programming style. Then we analyze your code and refactor it into a more idiomatic C++ solution. In the end we show a modern C++ solution that uses more object-oriented methods.

First your completed code:

#include <vector>
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>

struct Csv {
    float a;
    float b;
    float c;
    float d;
    std::string e;
};

int main() {
    std::string colname;

    // Iris csv dataset downloaded from
    // https://gist.github.com/curran/a08a1080b88344b0c8a7
    std::ifstream* myFile = new std::ifstream("r:\\iris.csv");


    std::vector<Csv>* csv = new std::vector<Csv>;

    std::string line;

    // Read the column names
    if (myFile->good())
    {
        // Extract the first line in the file
        std::getline(*myFile, line);

        // Create a stringstream from line
        std::stringstream ss(line);

        // Extract each column name
        while (std::getline(ss, colname, ',')) {

            std::cout << colname << std::endl;
        }
    }


    // Read data, line by line
    while (std::getline(*myFile, line))
    {
        // Create a stringstream of the current line
        std::stringstream ss(line);
        // Extract each column 
        std::string column;
        std::vector<std::string> columns{};

        while (std::getline(ss, column, ',')) {
            columns.push_back(column);
        }
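        // Skip blank or incomplete lines, otherwise columns[i] would be out of range
        if (columns.size() < 5) continue;
        // Convert the four numeric columns to float and keep the last one as text
        Csv csvTemp{};
        csvTemp.a = std::stof(columns[0]);
        csvTemp.b = std::stof(columns[1]);
        csvTemp.c = std::stof(columns[2]);
        csvTemp.d = std::stof(columns[3]);
        csvTemp.e = columns[4];
        // Store new row data
        csv->push_back(csvTemp);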
    }
    // Show everything
    for (const Csv& row : *csv)
        std::cout << row.a << '\t' << row.b << '\t' << row.c << '\t' << row.d << '\t' << row.e << '\n';


    // Release the objects that were allocated with new
    delete csv;
    delete myFile;

    return 0;
}

Your question about reading the columns from your CSV file can be answered like this:

You need a temporary vector. You then use std::getline to split the data in the std::stringstream at each comma and copy the resulting substrings into the vector. After that, we use string conversion functions and assign the results to a temporary Csv struct variable. After all conversions have been done, we append the temporary to the resulting csv vector, which holds all row data.


Analysis of the program.

First, and most important, in C++ we do not use raw pointers for owned memory. In most cases we should not even use new. If dynamic allocation is needed at all, std::unique_ptr and std::make_unique should be used.

But we do not need dynamic memory allocation on the heap at all. You can simply define the std::vector on the function's stack. Just as in your line std::string colname; you can define the std::vector and the std::ifstream as normal local variables, for example std::vector<Csv> csv{}; . If you need to pass this variable to another function, pass it by reference. Only if ownership really has to be transferred or shared should you use pointers, and then smart pointers.
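
A minimal sketch of what this can look like in practice. The function names addSampleRow, countRows and makeRowsOnHeap are hypothetical, and passing by reference is shown here as one common way to hand a local vector to another function without any pointer.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct Csv {
    float a;
    float b;
    float c;
    float d;
    std::string e;
};

// Hypothetical helper: modifies the caller's vector through a reference.
void addSampleRow(std::vector<Csv>& rows) {
    rows.push_back(Csv{5.1f, 3.5f, 1.4f, 0.2f, "setosa"});
}

// Hypothetical helper: only reads the data, so a const reference is enough.
std::size_t countRows(const std::vector<Csv>& rows) {
    return rows.size();
}

// Only if the vector really has to live on the heap:
// std::unique_ptr owns it and deletes it automatically.
std::unique_ptr<std::vector<Csv>> makeRowsOnHeap() {
    auto rows = std::make_unique<std::vector<Csv>>();
    addSampleRow(*rows);
    return rows;
}
```

With a local std::vector<Csv> csv{}; the calls are simply addSampleRow(csv); and countRows(csv); and no delete is needed anywhere.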

Next, if you open a file, as in std::ifstream myFile("r:\\iris.csv"); you do not need to test the stream's state with if (myFile->good()) . The stream classes overload operator bool, so you can write if (myFile) directly. It yields true as long as no error has occurred on the stream, which is equivalent to !myFile.fail() .
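
To illustrate the stream-to-bool conversion, here is a small sketch. The helpers canOpen and countLines are invented for this example. The same conversion is what ends a while (std::getline(...)) loop.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: true if the file could be opened for reading.
bool canOpen(const std::string& path) {
    std::ifstream file(path);
    // The explicit operator bool is true if no error flag is set (same as !file.fail())
    return static_cast<bool>(file);
}

// Hypothetical helper: counts the lines of any input stream.
// std::getline returns the stream, and the loop ends as soon as
// the stream converts to false, i.e. when no further line could be read.
int countLines(std::istream& is) {
    int count = 0;
    std::string line;
    while (std::getline(is, line)) {
        ++count;
    }
    return count;
}
```

Because countLines takes a std::istream&, it works with a std::ifstream as well as with a std::istringstream.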

Now to the next important point.

The structure of your source file is well known. There is a header with 5 elements, followed by rows that each contain 4 floating-point values and, at the end, a string without spaces. This makes life very easy.

If we needed to validate the input, or if there were spaces within a string, then we would need to implement other methods. But with this structure, we can use the built-in iostream facilities. The snippet

        // Read all data
        Csv tmp{};
        char comma;
        while (myFile >> tmp.a >> comma >> tmp.b >> comma >> tmp.c >> comma >> tmp.d >> comma >> tmp.e)
            csv.push_back(std::move(tmp));

will do the trick. Very simple.

So, the refactored solution could look like this:

#include <vector>
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>

struct Csv {
    float a;
    float b;
    float c;
    float d;
    std::string e;
};

int main() {

    std::vector<Csv> csv{};
    std::ifstream myFile("r:\\iris.csv");
    if (myFile) {
        
        if (std::string header{}; std::getline(myFile, header)) std::cout << header << '\n';

        // Read all data
        Csv tmp{};
        char comma;
        while (myFile >> tmp.a >> comma >> tmp.b >> comma >> tmp.c >> comma >> tmp.d >> comma >> tmp.e)
            csv.push_back(std::move(tmp));

        // Show everything
        for (const Csv& row : csv)
            std::cout << row.a << '\t' << row.b << '\t' << row.c << '\t' << row.d << '\t' << row.e << '\n';
    }
    return 0;
}

This is already much more compact. But there is more. . .
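
One caveat about the loop above: the final >> tmp.e stops at the first whitespace, so it relies on the last column containing no spaces. If that assumption did not hold, one possible variant (a sketch only, with a hypothetical function name readRow) would read the four numbers as before and then take the rest of the line with std::getline:

```cpp
#include <istream>
#include <sstream>
#include <string>

struct Csv {
    float a;
    float b;
    float c;
    float d;
    std::string e;
};

// Hypothetical helper: reads one row from the stream.
// The four numbers and the commas after them are read with formatted
// extraction; the remainder of the line, spaces included, goes into e.
bool readRow(std::istream& is, Csv& row) {
    char comma;
    if (!(is >> row.a >> comma >> row.b >> comma >> row.c >> comma >> row.d >> comma)) {
        return false;
    }
    // Note: with Windows line endings a trailing '\r' would stay in row.e
    return static_cast<bool>(std::getline(is, row.e));
}
```

The reading loop then becomes while (readRow(myFile, tmp)) csv.push_back(tmp); and the rest of the program stays the same.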


In the next step, we want to add a more object-oriented approach.

The key is that data and the methods operating on this data should be encapsulated in an object / class / struct. Only the Csv struct should know how to read and write its data.

Hence, we overload the extractor and inserter operators for the Csv struct. We use the same approach as before. We just encapsulate the reading and writing in the struct Csv.

After that, the main function will be even more compact and the usage is more logical.

Now we have:

#include <vector>
#include <iostream>
#include <fstream>
#include <string>

struct Csv {
    float a;
    float b;
    float c;
    float d;
    std::string e;

    friend std::istream& operator >> (std::istream& is, Csv& c) {
        char comma;
        return is >> c.a >> comma >> c.b >> comma >> c.c >> comma >> c.d >> comma >> c.e;
    }

    friend std::ostream& operator << (std::ostream& os, const Csv& c) {
        return os << c.a << '\t' << c.b << '\t' << c.c << '\t' << c.d << '\t' << c.e << '\n';
    }
};

int main() {

    std::vector<Csv> csv{};
    if (std::ifstream myFileStream("r:\\iris.csv"); myFileStream) {

        if (std::string header{}; std::getline(myFileStream, header)) std::cout << header << '\n';

        // Read all data
        Csv tmp{};
        while (myFileStream >> tmp)
            csv.push_back(std::move(tmp));

        // Show everything
        for (const Csv& row : csv)
            std::cout << row;
    }
    return 0;
}
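
A side benefit of the overloaded operators is that they take std::istream& and std::ostream&, so they are not tied to files. The following sketch uses string streams instead, which is handy for trying things out without iris.csv. The helpers readAll and toString are hypothetical names added for this example.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

struct Csv {
    float a;
    float b;
    float c;
    float d;
    std::string e;

    friend std::istream& operator >> (std::istream& is, Csv& c) {
        char comma;
        return is >> c.a >> comma >> c.b >> comma >> c.c >> comma >> c.d >> comma >> c.e;
    }

    friend std::ostream& operator << (std::ostream& os, const Csv& c) {
        return os << c.a << '\t' << c.b << '\t' << c.c << '\t' << c.d << '\t' << c.e << '\n';
    }
};

// Hypothetical helper: reads rows from any input stream until extraction fails.
std::vector<Csv> readAll(std::istream& is) {
    std::vector<Csv> rows;
    Csv tmp{};
    while (is >> tmp) {
        rows.push_back(tmp);
    }
    return rows;
}

// Hypothetical helper: formats one row with the inserter into a string.
std::string toString(const Csv& row) {
    std::ostringstream os;
    os << row;
    return os.str();
}
```

For example, a std::istringstream holding two data lines can be passed to readAll in exactly the same way as the std::ifstream in main above.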

OK. Already rather good. But even more is possible.


We can see that the source data has a header and then Csv data.

This, too, can be modelled as a struct. We call it Iris. And we also add extractor and inserter overloads to encapsulate all IO operations.

Additionally, we now use modern algorithms, regex, and IO iterators. I am not sure if this is too complex now. If you are interested, then I can give you further information. But for now, I will just show you the code.

#include <vector>
#include <iostream>
#include <fstream>
#include <string>
#include <algorithm>
#include <regex>
#include <iterator>

const std::regex re{ "," };

struct Csv {
    float a;
    float b;
    float c;
    float d;
    std::string e;
    // Overload the extractor for simple reading of data
    friend std::istream& operator >> (std::istream& is, Csv& c) {
        char comma;
        return is >> c.a >> comma >> c.b >> comma >> c.c >> comma >> c.d >> comma >> c.e;
    }
    // Ultra simple inserter
    friend std::ostream& operator << (std::ostream& os, const Csv& c) {
        return os << c.a << "\t\t" << c.b << "\t\t" << c.c << "\t\t" << c.d << "\t\t" << c.e << '\n';
    }
};

struct Iris {
    // Iris data consists of a header and then Csv data
    std::vector<std::string> header{};
    std::vector<Csv> csv{};

    // Overload the extractor for generic reading from streams
    friend std::istream& operator >> (std::istream& is, Iris& i) {
        // First read header values;
        if (std::string line{}; std::getline(is, line)) 
            std::copy(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {}, std::back_inserter(i.header));
        
        // Read all csv data
        std::copy(std::istream_iterator<Csv>(is), {}, std::back_inserter(i.csv));
        return is;
    }

    // Simple output. Copy data to stream os
    friend std::ostream& operator << (std::ostream& os, const Iris& i) {
        std::copy(i.header.begin(), i.header.end(), std::ostream_iterator<std::string>(os, "\t"));
        os << '\n'; // Write to os, not std::cout, so the output works for any stream
        std::copy(i.csv.begin(), i.csv.end(), std::ostream_iterator<Csv>(os));
        return os;
    }
};


// Driver Code
int main() {

    if (std::ifstream myFileStream("r:\\iris.csv"); myFileStream) {

        Iris iris{};

        // Read all data
        myFileStream >> iris;

        // Show result
        std::cout << iris;
    }
    return 0;
}

Look at the main function and how easy it is.
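
For readers who want to see the two less familiar building blocks in isolation, here is a small sketch. The function names splitAtCommas and readAllValues are invented for this example. std::sregex_token_iterator with the submatch index -1 iterates over the text between the matches of the regex, and std::istream_iterator<T> repeatedly applies operator >> for T until extraction fails.

```cpp
#include <istream>
#include <iterator>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a line at every comma.
// The index -1 selects the parts that are NOT matched by the regex,
// i.e. the fields between the commas.
std::vector<std::string> splitAtCommas(const std::string& line) {
    static const std::regex re{ "," };
    return std::vector<std::string>(
        std::sregex_token_iterator(line.begin(), line.end(), re, -1),
        std::sregex_token_iterator());
}

// Hypothetical helper: reads values of type T until extraction fails.
// A default-constructed std::istream_iterator marks the end of the input.
template <typename T>
std::vector<T> readAllValues(std::istream& is) {
    return std::vector<T>(std::istream_iterator<T>(is), std::istream_iterator<T>());
}
```

In the Iris struct above, the same two iterator types are combined with std::copy and std::back_inserter. With T = Csv, readAllValues would call the Csv extractor for every row.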

If you have questions, then please ask.


Language: C++17

Compiled and tested with MS Visual Studio 2019, community edition

