
C++ Using getline() inside loop to read in CSV file

I'm trying to read in a CSV file that contains one row for each of 3 people/patients, where col 1 is userid, col 2 is fname, col 3 is lname, col 4 is insurance, and col 5 is version. It looks something like below.

Edit: Apologies, I simply copy/pasted my CSV spreadsheet in here, so it didn't show the commas before. Wouldn't it look something more like below? John below also pointed out that there are no commas after the version, and this seemed to fix the issue! Thanks so much John! ( trying to figure out how I can accept your answer :) )

nm92,Nate,Matthews,Aetna,1
sc91,Steve,Combs,Cigna,2
ml94,Morgan,Lands,BCBS,3

I'm trying to use getline() inside of a loop to read everything in, and it works fine for the first iteration, but getline() seems to be causing it to skip a value on the next iterations. Any idea how I can solve this?

I'm also not sure why the output looks like below, because I'm not seeing where the lines w/ "sc91" and "ml94" are being printed in the code. This is what the output of the current code looks like.

userid is: nm92
fname is: Nate
lname is: Matthews
insurance is: Aetna
version is: 1
sc91
userid is: Steve
fname is: Combs
lname is: Cigna
insurance is: 2
ml94
version is: Morgan
userid is: Lands
fname is: BCBS
lname is: 3

insurance is:
version is:

I've done a ton of research on differences between getline() and the >> stream operator, but most of the getline() materials seem to revolve around getting input from cin rather than reading from a file like here, so I'm thinking there's something going on w/ getline() and how it's reading the file that I'm not understanding. Unfortunately when I tried >> operator, that forces me to use the strtok() function, and I was struggling a lot with c strings and assigning them to an array of C++ strings.

#include <iostream>
#include <string>                               // for strings
#include <cstring>                              // for strtok()
#include <fstream>                              // for file streams

using namespace std;

struct enrollee
{
    string userid = "";
    string fname = "";
    string lname = "";
    string insurance = "";
    string version = "";
};

int main()
{
    const int ENROLL_SIZE = 1000;               // used const instead of #define since the performance diff is negligible,
    const int numCols = 5;                    // while const allows for greater utility/debugging bc it is known to the compiler ,
                                                // while #define is a preprocessor directive
    ifstream inputFile;                         // create input file stream for reading only
    struct enrollee enrollArray[ENROLL_SIZE];   // array of structs to store each enrollee and their respective data
    int arrayPos = 0;

    // open the input file to read
    inputFile.open("input.csv");
    // read the file until we reach the end
    while(!inputFile.eof())
    {
        //string inputBuffer;                         // buffer to store input, which will hold an entire excel row w/ cells delimited by commas
                                                    // must be a c string since strtok() only takes c string as input
        string tokensArray[numCols];
        string userid = "";
        string fname = "";
        string lname = "";
        string insurance = "";
        string sversion = "";
        //int version = -1;

        //getline(inputFile,inputBuffer,',');
        //cout << inputBuffer << endl;

        getline(inputFile,userid,',');
        getline(inputFile,fname,',');
        getline(inputFile,lname,',');
        getline(inputFile,insurance,',');
        getline(inputFile,sversion,',');

        enrollArray[0].userid = userid;
        enrollArray[0].fname = fname;
        enrollArray[0].lname = lname;
        enrollArray[0].insurance = insurance;
        enrollArray[0].version = sversion;

        cout << "userid is: " << enrollArray[0].userid << endl;
        cout << "fname is: " << enrollArray[0].fname << endl;
        cout << "lname is: " << enrollArray[0].lname << endl;
        cout << "insurance is: " << enrollArray[0].insurance << endl;
        cout << "version is: " << enrollArray[0].version << endl;
    }
}

This is only an idea, but it might help you. It's a piece of code from a project I am working on:

std::vector<std::string> ARDatabase::split(const std::string& line, char delimiter)
{
    std::vector<std::string> tokens;
    std::string token;
    std::istringstream tokenStream(line);
    while (std::getline(tokenStream, token, delimiter))
    {
        tokens.push_back(token);
    }
    return tokens;
}

void ARDatabase::read_csv_map(std::string root_csv_map)
{
    qDebug() << "Starting to read the people database...";
    std::ifstream file(root_csv_map);
    std::string str;
    while (std::getline(file, str))
    {
        std::vector<std::string> tokens = split(str, ' ');
        std::vector<std::string> splitnames = split(tokens.at(1), '_');

        std::string name_w_spaces;
        for(auto i: splitnames) name_w_spaces = name_w_spaces + i + " ";

        people_names.insert(std::make_pair(stoi(tokens.at(0)), name_w_spaces));
        people_images.insert(std::make_pair(stoi(tokens.at(0)), std::string("database/images/" + tokens.at(2))));

    }
}

Instead of std::vector, you might want to use another container that is more suitable for your case. The second function is written for the input format of my project, but you can easily adapt it to your code.
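For the CSV in the question, the same split() idea might be adapted roughly as below. This is only a sketch: split() is rewritten as a free function, readWithSplit() is a made-up helper name, the struct mirrors the one in the question, and skipping rows that do not have exactly 5 fields is an assumption about how bad input should be handled.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Same fields as the struct in the question.
struct enrollee
{
    std::string userid, fname, lname, insurance, version;
};

// Split one line on `delimiter` (free-function version of split() above).
std::vector<std::string> split(const std::string& line, char delimiter)
{
    std::vector<std::string> tokens;
    std::string token;
    std::istringstream tokenStream(line);
    while (std::getline(tokenStream, token, delimiter))
        tokens.push_back(token);
    return tokens;
}

// Hypothetical helper: read every row that has exactly 5 fields from `in`.
std::vector<enrollee> readWithSplit(std::istream& in)
{
    std::vector<enrollee> result;
    std::string line;
    while (std::getline(in, line))              // loop ends when no line is left
    {
        if (!line.empty() && line.back() == '\r')
            line.pop_back();                    // tolerate Windows line endings
        std::vector<std::string> t = split(line, ',');
        if (t.size() != 5)
            continue;                           // assumption: skip blank or malformed rows
        result.push_back({t[0], t[1], t[2], t[3], t[4]});
    }
    return result;
}
```

Called as readWithSplit(inputFile) on the opened ifstream, this should give one enrollee per row. Note that a row whose last field is empty yields only 4 tokens with this split() and is therefore skipped.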

Your problem is that there is no comma after the final data item in each line, so

 getline(inputFile,sversion,',');

is incorrect because it reads to the next comma, which is actually on the next line, after the user id of the next patient. This explains the output you see, where the user id of the next patient gets output with the version.

To fix this simply replace the code above with

 getline(inputFile,sversion);

which will read to the end of line as required.
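Put together, the corrected read loop could look something like the sketch below. It takes any std::istream so it can be tried without a file; readFieldByField() is a hypothetical name, and moving the reads into the loop condition in place of the eof() check is an extra change beyond the fix above. Testing eof() before reading can run the loop body one more time after the last row when the file ends with a newline.

```cpp
#include <istream>
#include <sstream>   // only needed to try the function on an in-memory string
#include <string>
#include <vector>

// Same fields as the struct in the question.
struct enrollee
{
    std::string userid, fname, lname, insurance, version;
};

// Hypothetical helper: read rows of the form userid,fname,lname,insurance,version.
std::vector<enrollee> readFieldByField(std::istream& in)
{
    std::vector<enrollee> rows;
    enrollee e;
    // The body runs only when all five reads succeeded.
    while (std::getline(in, e.userid, ',') &&
           std::getline(in, e.fname, ',') &&
           std::getline(in, e.lname, ',') &&
           std::getline(in, e.insurance, ',') &&
           std::getline(in, e.version))          // no delimiter: read to end of line
    {
        rows.push_back(e);
    }
    return rows;
}
```

With the ifstream from the question this would be called as readFieldByField(inputFile). Using a std::vector instead of the fixed-size array is a choice made for the sketch, not a requirement.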

Regarding your function: if you look at the structure of the source file, you will see that each line contains 5 strings separated by ",". So it is a typical CSV file.

A call to std::getline without a delimiter argument will read a complete line with the 5 strings. In your code you are trying to call std::getline for each single string, followed by a comma. A comma is not present after the last string, so that will not work. You should use getline to get a complete line instead.

You need to read the whole line and then tokenize it.

I will show you an example of how to do that with the std::sregex_token_iterator. That is very simple. Additionally, we will overload the inserter and extractor operators. With that, you can easily read and write "enrollee" data like Enrollee e{}; std::cout << e;

Additionally I use C++ algorithms. That makes life very easy. Input and output are each a one-liner in main.

Please see:

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <algorithm>
#include <iterator>
#include <regex>


struct Enrollee
{
    // Data
    std::string userid{};
    std::string fname{};
    std::string lname{};
    std::string insurance{};
    std::string version{};

    // Overload Extractor Operator to read data from somewhere
    friend std::istream& operator >> (std::istream &is, Enrollee& e) {
        std::vector<std::string> wordsInLine{};       // Here we will store all words that we read in one line
        std::string wholeLine;                        // Temporary storage for the complete line that we will get by getline
        std::regex separator("[ \\;\\,]");            // Separators for a CSV file: space, semicolon, or comma
        std::getline(is, wholeLine);                  // Read one complete line and split it into parts
        std::copy(std::sregex_token_iterator(wholeLine.begin(), wholeLine.end(), separator, -1), std::sregex_token_iterator(), std::back_inserter(wordsInLine));
        // If we have read all expected strings, then store them in our struct
        if (wordsInLine.size() == 5) {
            e.userid = wordsInLine[0];
            e.fname = wordsInLine[1];
            e.lname = wordsInLine[2];
            e.insurance = wordsInLine[3];
            e.version = wordsInLine[4];
        }
        return is;
    }

    // Overload Inserter operator. Insert data into output stream
    friend std::ostream& operator << (std::ostream& os, const Enrollee& e) {
        return os << "userid is:    " << e.userid << "\nfname is:     " << e.fname << "\nlname is:     " << e.lname << "\ninsurance is: " << e.insurance << "\nversion is:   " << e.version << '\n';
    }
};


int main()
{
    // Here we will store all Enrollee data in a dynamically growing vector
    std::vector<Enrollee> enrollmentData{};

    // Define inputFileStream and open the csv
    std::ifstream inputFileStream("r:\\input.csv");

    // If we could open the file
    if (inputFileStream) {

        // Then read all csv data
        std::copy(std::istream_iterator<Enrollee>(inputFileStream), std::istream_iterator<Enrollee>(), std::back_inserter(enrollmentData));

        // For Debug Purposes: Print all data to cout
        std::copy(enrollmentData.begin(), enrollmentData.end(), std::ostream_iterator<Enrollee>(std::cout, "\n"));
    }
    else {
        std::cerr << "Could not open file 'input.csv'\n";
    }
}

This will read the input file "input.csv" containing

nm92,Nate,Matthews,Aetna,1
sc91,Steve,Combs,Cigna,2
ml94,Morgan,Lands,BCBS,3

And show as output:

userid is:    nm92
fname is:     Nate
lname is:     Matthews
insurance is: Aetna
version is:   1

userid is:    sc91
fname is:     Steve
lname is:     Combs
insurance is: Cigna
version is:   2

userid is:    ml94
fname is:     Morgan
lname is:     Lands
insurance is: BCBS
version is:   3
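The tokenizing step used in the extractor above can also be tried on its own. The sketch below is a variation, not the code from the answer: splitOnComma() is a hypothetical helper, and it splits on commas only, on the assumption that a field such as an insurance name may itself contain a space. The separator in the answer also splits on spaces, so such a row would produce more than 5 words there.

```cpp
#include <iterator>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split `line` into fields, using the comma as the only separator.
std::vector<std::string> splitOnComma(const std::string& line)
{
    static const std::regex separator(",");
    // Submatch index -1 selects the text between matches of the separator.
    return std::vector<std::string>(
        std::sregex_token_iterator(line.begin(), line.end(), separator, -1),
        std::sregex_token_iterator());
}
```

For a line like nm92,Nate,Matthews,Blue Cross,1 this should return 5 fields, with "Blue Cross" kept together as the fourth one.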


 