
Read from comma separated file into vector of objects

I have written a simple C++ program to learn the language. It's a game that writes its data (score, name, etc.) to a file at the end and reads it back in. Each line in the file stores the content of one Player object.

Ex: ID Age Name etc.

I now want to switch to comma separation in the file, but I ran into the problem of how to read each line and store the resulting Player object correctly in a std::vector of Player objects.

My code currently looks like this:

std::vector<Player> readPlayerToVector()
{
    // Open the File
    std::ifstream in("players.txt");

    std::vector<Player> players; // Empty player vector

    while (in.good()) {
        Player temp; //
        in >> temp.pID;
        ....
        players.push_back(temp);
    }
    in.close();
    return players;
}

How should I change this code to work with comma separation? Right now it works with space separation, using the overload of >>.

Be aware that I am a beginner in C++. I've tried looking at examples where std::getline(ss, line) is used with a stringstream, but I can't figure out a good way to fill the Player object with that method.

I provided a similar solution here:

read.dat file in c++ and create to multiple data types

#include <iostream>
#include <sstream>
#include <string>
#include <vector>


struct Coefficients {
    unsigned A;
    std::vector<double> B;
    std::vector< std::vector<double> > C;
};

std::vector<double> parseFloats( const std::string& s ) {
    std::istringstream isf( s );
    std::vector<double> res;
    double value;
    // Test the extraction itself, so a failed read is never stored
    while ( isf >> value ) {
        res.push_back( value );
    }
    return res;
}

void readCoefficients( std::istream& fs, Coefficients& c ) {
    fs >> c.A;
    std::ws( fs );
    std::string line;
    std::getline( fs, line );
    c.B = parseFloats( line );
    while ( std::getline( fs, line ) ) {
        c.C.push_back( parseFloats( line ) );
    }
}

This one also might apply:

Best way to read a files contents and separate different data types into separate vectors in C++

    std::vector<int> integers;
    std::vector<std::string> strings;

    // open file and iterate
    std::ifstream file( "filepath.txt" );
    // read one line per pass; the loop ends when no further line can be read
    std::string line;
    while ( std::getline( file, line ) ) {

        // create stream for fields
        std::istringstream ils( line );
        std::string token;

        // read integer (I like to parse it and convert separated)
        if ( !std::getline(ils, token, ',') ) continue;
        int ivalue;
        try { 
            ivalue = std::stoi( token );
        } catch (...) {
            continue;
        }
        integers.push_back(  ivalue );

        // Read string
        if ( !std::getline( ils, token, ',' )) continue;
        strings.push_back( token );
    }

You could separate the variables by line rather than by comma. I find this approach much simpler, as you can use the getline function directly.

Have a read of the documentation of ifstream/ofstream. I've done several projects based on this documentation alone!

C++ fstream reference
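The suggestion to put each value on its own line could look roughly like the sketch below. The Record struct, its two fields and the function name readRecords are assumptions made for illustration only; they are not taken from the question.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Assumed layout for illustration: every record occupies two lines,
// the first holding an ID and the second a name.
struct Record {
    int id{};
    std::string name{};
};

// Hypothetical helper: reads records until the input ends
std::vector<Record> readRecords(std::istream& in) {
    std::vector<Record> records;
    std::string idLine, nameLine;
    // Read two lines per record; stop when a record is incomplete
    while (std::getline(in, idLine) && std::getline(in, nameLine)) {
        Record r;
        r.id = std::stoi(idLine);   // throws if the line is not a number
        r.name = nameLine;          // may contain spaces
        records.push_back(r);
    }
    return records;
}
```

Because the function takes a std::istream&, it works with a std::ifstream as well as with a std::istringstream, which is convenient for testing.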

I will try to help and explain all the steps to you. I will first show a little bit of theory, then an easy solution, then some alternative solutions and finally the C++ (object-oriented) approach.

So, we will go from super easy to a more modern C++ solution.

Let's start. Assume that you have a player with some attributes. The attributes could be, for example: ID Name Age Score. If you store this data in a file, it could look like:

1  Peter   23   0.98
2  Carl    24   0.75
3  Bert    26   0.88
4  Mike    24   0.95

But at some point in time, we notice that this nice and simple format will not work any longer. The reason is that formatted input functions with the extractor operator >> will stop the conversion at a white space. And this will not work for the following example:

1  Peter Paul    23   0.98
2  Carl Maria    24   0.75
3  Bert Junior   26   0.88
4  Mike Senior   24   0.95

Then the statement fileStream >> id >> name >> age >> score; will not work any longer, and everything will fail. Therefore storing data in a CSV (Comma Separated Values) format is widely chosen.
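To see the failure in isolation, here is a small sketch. The helper name readSpaceSeparated is made up for this illustration; it simply applies the statement from above to one line of text.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper for illustration: tries to read "ID Name Age Score"
// from one line with formatted input only. Returns true if all four
// extractions succeeded.
bool readSpaceSeparated(const std::string& line, unsigned long& id,
                        std::string& name, int& age, double& score) {
    std::istringstream iss(line);
    // >> stops at whitespace, so name receives a single word only
    return static_cast<bool>(iss >> id >> name >> age >> score);
}
```

With the line "1  Peter   23   0.98" the call succeeds. With "1  Peter Paul    23   0.98" the name becomes "Peter", the extraction of the age then meets "Paul" instead of a number, and the stream goes into a failed state.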

The file would then look like:

1,  Peter Paul,    23,   0.98
2,  Carl Maria,    24,   0.75
3,  Bert Junior,   26,   0.88
4,  Mike Senior,   24,   0.95

And with that, we can clearly see which value belongs to which attribute. Unfortunately, this makes reading more difficult, because you now need to follow 3 steps:

  1. Read a complete line as a std::string
  2. Split this string into substrings using the comma as a separator
  3. Convert the substrings to the required format, for example from string to number age

So, let us solve this step by step.

Reading a complete line is easy. For this we have the function std::getline. It reads a line (all text up to the end-of-line character '\n') from a stream (any istream, such as std::cin, a std::ifstream or a std::istringstream) and stores it in a std::string variable. Please read the description of the function in the C++ reference.

Now, splitting a CSV string into its parts. There are so many methods available that it is hard to tell which one is best. I will show several methods later, but the most common approach uses std::getline. (My personal favorite is the std::sregex_token_iterator, because it fits perfectly into the C++ algorithm world. But for here, it is too complex.)

OK, std::getline . As you have read in the CPP reference, std::getline reads characters until it finds a delimiter. If you do not specify a delimiter, then it will read until the end of line \n . But you can also specify a different delimiter. And this we will do in our case. We will choose the delimiter ','.

But there is an additional problem: after reading a complete line in step 1, we have this line in a std::string, and std::getline wants to read from a stream. So std::getline with a comma as delimiter cannot be used directly with a std::string as its source. Fortunately, there is a standard approach for this as well. We convert the std::string into a stream by using a std::istringstream. You simply define a variable of this type and pass the string you just read to its constructor. For example:

std::istringstream iss(line);

And now we can use all iostream functions also with this “iss”. Cool. We will use std::getline with a ',' delimiter and receive a substring.
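As a minimal sketch of this second step on its own, the function below splits one line at every comma. The name splitAtComma is an assumption for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one line at every comma.
std::vector<std::string> splitAtComma(const std::string& line) {
    std::istringstream iss(line);
    std::vector<std::string> parts;
    std::string part;
    // Each call reads up to the next ',' (or the end of the string);
    // the comma itself is consumed and not stored
    while (std::getline(iss, part, ',')) {
        parts.push_back(part);
    }
    return parts;
}
```

Note that blanks after a comma stay in the substring, and that an empty field at the very end of a line (as in "a,") is not reported by this loop.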

The third and last step is unfortunately also necessary. Now we have a bunch of substrings, but three of our attributes are numbers. The “ID” is an unsigned long, the “Age” is an int and the “Score” is a double. So we need string conversion functions to convert the substrings to numbers: std::stoul, std::stoi and std::stod. If the input data is always valid, this is sufficient; if we need to validate the input, it gets more complicated. Let us assume that we have good input.
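If validation is needed after all, one possible sketch for the int case is shown here. The helper name toInt and the decision to accept trailing blanks are assumptions; std::stoi itself skips leading whitespace and throws std::invalid_argument or std::out_of_range on failure.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts a field to int and reports failure
// through the return value instead of an exception.
bool toInt(const std::string& field, int& result) {
    try {
        std::size_t used{};
        const int value = std::stoi(field, &used);
        // Allow trailing blanks, reject anything else after the number
        if (field.find_first_not_of(" \t", used) != std::string::npos) return false;
        result = value;
        return true;
    } catch (const std::invalid_argument&) {
        return false;   // no digits at all
    } catch (const std::out_of_range&) {
        return false;   // number does not fit into an int
    }
}
```

The same pattern should carry over to std::stoul and std::stod, since they offer the same position parameter and throw the same exception types.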

Then, here is one of the many possible examples:

#include <iostream>
#include <fstream>
#include <vector>
#include <sstream>
#include <string>

struct Player {
    unsigned long ID{};
    std::string name{};
    int age{};
    double score{};
};

// !!! Demo. All without error checking !!!
int main() {

    // Open the source CSV file
    std::ifstream in("players.txt");

    // Here we will store all players that we read
    std::vector<Player> players{};

    // We will read a complete line and store it here
    std::string line{};

    // Read all lines of the source CSV file
    while (std::getline(in, line)) {
        
        // Now we read a complete line into our std::string line
        // Put it into a std::istringstream to be able to extract it with iostream functions
        std::istringstream iss(line);

        // We will use a vector to store the substrings
        std::string substring{};
        std::vector<std::string> substrings{};
        
        // Now, in a loop, get the substrings from the std::istringstream
        while (std::getline(iss, substring, ',')) {

            // Add the substring to the std::vector
            substrings.push_back(substring);
        }
        // Now store the data for one player in a Player struct
        Player player{};
        player.ID = std::stoul(substrings[0]);
        player.name = substrings[1];
        player.age = std::stoi(substrings[2]);
        player.score = std::stod(substrings[3]);

        // Add this new player to our player list
        players.push_back(player);
    }

    // Debug output
    for (const Player& p : players) {
        std::cout << p.ID << "\t" << p.name << '\t' << p.age << '\t'  << p.score << '\n';
    }
}

You see, it is getting more complex.
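One detail of the example above: the sample file has blanks after each comma, so the name substring keeps them (for example "  Peter Paul"). The number conversions are not affected, because std::stoul, std::stoi and std::stod skip leading whitespace. A small sketch of a trim function, with the name trim chosen for illustration, could remove the blanks from the name:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: removes leading and trailing blanks and tabs.
std::string trim(const std::string& s) {
    const char* blanks = " \t";
    const std::size_t first = s.find_first_not_of(blanks);
    if (first == std::string::npos) return {};   // only blanks, or empty
    const std::size_t last = s.find_last_not_of(blanks);
    return s.substr(first, last - first + 1);
}
```

It would be applied when storing the name, for example as player.name = trim(substrings[1]);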

If you are more experienced, you can also use other mechanisms. But then you need to understand the difference between formatted and unformatted input and need a little more practice. This is complex. (So, do not use it in the beginning):

#include <iostream>
#include <fstream>
#include <vector>
#include <sstream>
#include <string>

struct Player {
    unsigned long ID{};
    std::string name{};
    int age{};
    double score{};
};

// !!! Demo. All without error checking !!!
int main() {

    // Open the source CSV file
    std::ifstream in("r:\\players.txt");

    // Here we will store all players that we read
    Player player{};
    std::vector<Player> players{};
    
    char comma{}; // Some dummy for reading a comma

    // Read all lines of the source CSV file
    // Note: std::getline already consumes the comma after the name, so no dummy read is needed before the age
    while (std::getline(in >> player.ID >> comma >> std::ws, player.name, ',') >> player.age >> comma >> player.score) {

        // Add this new player to our player list
        players.push_back(player);
    }
    // Debug output
    for (const Player& p : players) {
        std::cout << p.ID << "\t" << p.name << '\t' << p.age << '\t' << p.score << '\n';
    }
}

As said, do not use this in the beginning.
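To make the difference between formatted and unformatted input more tangible, here is a reduced sketch with only three fields. The function name readRecord and the layout "number,text,number" are assumptions for illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: reads "<number>,<text>,<number>" from one line,
// mixing formatted input (>>) and unformatted input (std::getline).
bool readRecord(const std::string& line, int& first, std::string& text, int& second) {
    std::istringstream iss(line);
    char comma{};
    // Formatted: >> skips leading whitespace before the number and the comma.
    // std::ws then discards the blanks that std::getline would otherwise keep.
    iss >> first >> comma >> std::ws;
    // Unformatted: reads up to the next ','; that comma is extracted and discarded
    std::getline(iss, text, ',');
    // Formatted again: continues right after the comma and skips the blanks
    iss >> second;
    return !iss.fail() && comma == ',';
}
```

For the line "1,  Peter Paul,    23" this yields 1, "Peter Paul" and 23. If a field is missing, one of the reads fails and the function returns false.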

But, what you should try to learn and understand is: C++ is an object oriented language. This means we do not only put the data into the Player struct, but also the methods that operate on this data.

At the moment, those are just input and output. As you already know, input and output are done using iostream functionality with the extractor operator >> and the inserter operator <<. But how do we do this? Our Player struct is a custom type. It has no built-in >> and << operators.

Fortunately, C++ is a powerful language and allows us to add such functionality easily.

The declaration of the struct would then look like:

struct Player {

    // The data part
    unsigned long ID{};
    std::string name{};
    int age{};
    double score{};

    // The methods part
    friend std::istream& operator >> (std::istream& is, Player& p);
    friend std::ostream& operator << (std::ostream& os, const Player& p);
};

And, after writing the code for these operators using the above-mentioned method, we will get:

#include <iostream>
#include <fstream>
#include <vector>
#include <sstream>
#include <string>

struct Player {

    // The data part
    unsigned long ID{};
    std::string name{};
    int age{};
    double score{};

    // The methods part
    friend std::istream& operator >> (std::istream& is, Player& p) {
        std::string line{}, substring{}; std::vector<std::string> substrings{};
        // Stop here if no further line could be read (for example at the end of the file)
        if (!std::getline(is, line)) return is;
        std::istringstream iss(line);
        // Read all substrings
        while (std::getline(iss, substring, ','))
            substrings.push_back(substring);
        // A line with too few fields cannot be a player: report failure through the stream
        if (substrings.size() < 4) {
            is.setstate(std::ios_base::failbit);
            return is;
        }
        // Now store the data for one player in the given Player struct
        p.ID = std::stoul(substrings[0]);
        p.name = substrings[1];
        p.age = std::stoi(substrings[2]);
        p.score = std::stod(substrings[3]);
        return is;
    }
    friend std::ostream& operator << (std::ostream& os, const Player& p) {
        return os << p.ID << "\t" << p.name << '\t' << p.age << '\t' << p.score;
    }
};

// !!! Demo. All without error checking !!!
int main() {

    // Open the source CSV file
    std::ifstream in("r:\\players.txt");

    // Here we will store all players that we read
    Player player{};
    std::vector<Player> players{};


    // Read all lines of the source CSV file into players
    while (in >> player) {

        // Add this new player to our player list
        players.push_back(player);
    }

    // Debug output
    for (const Player& p : players) {
        std::cout << p << '\n';
    }
}

It is simply reusing everything from what we learned above. Just put it at the right place.

We can even go one step further. The player list, the std::vector<Player>, can also be wrapped in a class and extended with iostream functionality.

By knowing all of the above, this will be really simple now. See:

#include <iostream>
#include <fstream>
#include <vector>
#include <sstream>
#include <string>

struct Player {

    // The data part
    unsigned long ID{};
    std::string name{};
    int age{};
    double score{};

    // The methods part
    friend std::istream& operator >> (std::istream& is, Player& p) {
        char comma{}; // Some dummy for reading a comma
        // std::getline already consumes the comma after the name, so no dummy read is needed before the age
        return std::getline(is >> p.ID >> comma >> std::ws, p.name, ',') >> p.age >> comma >> p.score;
    }
    friend std::ostream& operator << (std::ostream& os, const Player& p) {
        return os << p.ID << "\t" << p.name << '\t' << p.age << '\t' << p.score;
    }
};

struct Players {

    // The data part
    std::vector<Player> players{};

    // The methods part
    friend std::istream& operator >> (std::istream& is, Players& ps) {
        Player player{};
        while (is >> player) ps.players.push_back(player);
        return is;
    }
    friend std::ostream& operator << (std::ostream& os, const Players& ps) {
        for (const Player& p : ps.players) os << p << '\n';
        return os;
    }
};

// !!! Demo. All without error checking !!!
int main() {

    // Open the source CSV file
    std::ifstream in("players.txt");

    // Here we will store all players that we read
    Players players{};

    // Read the complete CSV file and store everything in the players list at the correct place
    in >> players;

    // Debug output of complete players data. Ultra short.
    std::cout << players;
}

I hope you can see how simple and yet powerful this solution is.
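One thing to keep in mind: the inserter operator shown above writes tab-separated output for debugging, so its output cannot be read back by the comma-based extractor. For saving the game data again, a CSV writer could be sketched as follows. The function name writeCsv is an assumption, and the Player struct is repeated only to keep the sketch self-contained.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Same fields as the Player struct above, repeated here so the
// sketch is self-contained.
struct Player {
    unsigned long ID{};
    std::string name{};
    int age{};
    double score{};
};

// Hypothetical helper: writes one player per line, comma separated.
void writeCsv(std::ostream& os, const std::vector<Player>& players) {
    for (const Player& p : players) {
        os << p.ID << ',' << p.name << ',' << p.age << ',' << p.score << '\n';
    }
}
```

This simple format assumes that a name never contains a comma or a line break; otherwise quoting rules would be needed, which are beyond this sketch.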

At the very end, as promised. Some further methods to split a string into substrings:

Splitting a string into tokens is a very old task, and there are many solutions available. They all have different properties: some are difficult to understand, some are hard to develop, and they differ in complexity, speed and flexibility.

Alternatives

  1. Handcrafted, with many variants, using pointers or iterators. Can be hard to develop and error-prone.
  2. Using the old-style std::strtok function. Possibly unsafe, and probably should not be used any longer.
  3. std::getline. The most used implementation, but actually a "misuse" and not so flexible.
  4. Using a dedicated modern function, specifically developed for this purpose. The most flexible, and it fits well into the STL environment and the algorithm landscape. But slower.

Please see 4 examples in one piece of code.

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <regex>
#include <algorithm>
#include <iterator>
#include <type_traits>
#include <cstring>
#include <forward_list>
#include <deque>

using Container = std::vector<std::string>;
std::regex delimiter{ "," };


int main() {

    // Some function to print the contents of an STL container
    auto print = [](const auto& container) -> void { std::copy(container.begin(), container.end(),
        std::ostream_iterator<std::decay_t<decltype(*container.begin())>>(std::cout, " ")); std::cout << '\n'; };

    // Example 1:   Handcrafted -------------------------------------------------------------------------
    {
        // Our string that we want to split
        std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
        Container c{};

        // Search for comma, then take the part and add to the result
        for (size_t i{ 0U }, startpos{ 0U }; i <= stringToSplit.size(); ++i) {

            // So, if there is a comma or the end of the string
            if ((stringToSplit[i] == ',') || (i == (stringToSplit.size()))) {

                // Copy substring
                c.push_back(stringToSplit.substr(startpos, i - startpos));
                startpos = i + 1;
            }
        }
        print(c);
    }

    // Example 2:   Using very old strtok function ----------------------------------------------------------
    {
        // Our string that we want to split
        std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
        Container c{};

        // Split string into parts in a simple for loop
#pragma warning(suppress : 4996)
        for (char* token = std::strtok(const_cast<char*>(stringToSplit.data()), ","); token != nullptr; token = std::strtok(nullptr, ",")) {
            c.push_back(token);
        }

        print(c);
    }

    // Example 3:   Very often used std::getline with additional istringstream ------------------------------------------------
    {
        // Our string that we want to split
        std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
        Container c{};

        // Put string in an std::istringstream
        std::istringstream iss{ stringToSplit };

        // Extract string parts in simple for loop
        for (std::string part{}; std::getline(iss, part, ','); c.push_back(part))
            ;

        print(c);
    }

    // Example 4:   Most flexible iterator solution  ------------------------------------------------

    {
        // Our string that we want to split
        std::string stringToSplit{ "aaa,bbb,ccc,ddd" };


        Container c(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {});
        //
        // Everything done already with range constructor. No additional code needed.
        //

        print(c);


        // Works also with other containers in the same way
        std::forward_list<std::string> c2(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {});

        print(c2);

        // And works with algorithms
        std::deque<std::string> c3{};
        std::copy(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {}, std::back_inserter(c3));

        print(c3);
    }
    return 0;
}

Happy coding!
