
simple CSV parser (C++) to deal with commas inside quotes

CSVs seem to be a perennial issue. In my case, I have data like this:

"Incident Number","Incident Types","Reported Date","Nearest Populated Centre"
"INC2008-008","Release of Substance","01/29/2008","Fort Nelson"
"INC2008-009","Release of Substance, Adverse Environmental Effects","01/29/2008","Taylor"

I built a parser that turns it into a vector<vector<string>>:

string message = "Loading CSV File...\n";
genericMessage(message);
vector<vector<string>> content;
vector<string> row;
string line, word, block;

vector<string> incidentNoVector;

fstream file(fname, ios::in);
if (file.is_open())
{
    while (getline(file, line))
    {
        row.clear();

        stringstream str(line);

        while (getline(str, word, ','))
            row.push_back(word);
        content.push_back(row);
    }
}
else
    cout << "Could not open the file\n";

However, I didn't notice the extra comma inside a quoted field in some of the data (the third line of the sample above), which the parser wrongly splits into two fields. Any ideas? I've already built a large amount of code on top of the vector<vector<string>> output, so I really can't afford to change that type.

Once I have the vector, I strip out the first line (the header) and place it in its own object, then put all the remaining rows in a separate object that I can index using [][].

// Place the header information in an object, then remove it from the vector
Data_Headers colHeader;
colHeader.setColumn_headers(content[0]);
content.erase(content.begin());

// Place the row data in an object.
Data_Rows allData;
allData.setColumn_data(content);

Row_Key incidentNumbers;
for (size_t i = 0; i < allData.getColumn_data().size(); i++)  // size_t avoids a signed/unsigned comparison
{
    incidentNoVector.push_back(allData.getColumn_data()[i][0]);
}
incidentNumbers.setIncident_numbers(incidentNoVector);

Any help would be hugely appreciated!
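One way to keep the vector<vector<string>> output unchanged is to swap the inner getline(str, word, ',') loop for a quote-aware splitter. The following is only a minimal sketch, not a full CSV implementation: splitCsvLine is a hypothetical helper name (it is not part of the original post), it is a hand-written state machine rather than a library call, and it assumes that a quoted field never contains a line break, since the outer loop still reads the file line by line.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): splits one CSV line into
// fields, treating commas inside double quotes as ordinary data.
std::vector<std::string> splitCsvLine(const std::string& line) {
    std::vector<std::string> fields;
    std::string field;
    bool inQuotes = false;

    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (inQuotes) {
            if (c == '"') {
                if (i + 1 < line.size() && line[i + 1] == '"') {
                    field += '"';      // doubled quote inside a quoted field
                    ++i;
                } else {
                    inQuotes = false;  // closing quote
                }
            } else {
                field += c;            // commas land here while quoted
            }
        } else if (c == '"') {
            inQuotes = true;           // opening quote, not stored
        } else if (c == ',') {
            fields.push_back(field);   // separator outside quotes ends a field
            field.clear();
        } else if (c != '\r') {        // drop a stray CR from Windows line endings
            field += c;
        }
    }
    fields.push_back(field);           // last field (an empty line gives one empty field)
    return fields;
}
```

With such a helper, the body of the outer while loop in the question could become content.push_back(splitCsvLine(line)); and the rest of the code would keep working on the same type. Note that the surrounding quotes are removed from each field, which differs from the original parser.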

If you don't want to use a ready-made CSV parser library, you could create a class that stores the values of one row and overload operator>> for that class. Use std::quoted (from <iomanip>, available since C++14) when reading the individual fields in that operator.

Example:

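#include <iomanip>   // std::quoted
#include <iostream>
#include <iterator>  // std::istream_iterator
#include <string>
#include <vector>

struct Eater { // A class mainly used for eating up commas in an istream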
    char ch;
};

std::istream& operator>>(std::istream& is, const Eater& e) {
    char ch;
    if(is.get(ch) && ch != e.ch) is.setstate(std::istream::failbit);
    return is;
}

struct Row {
    std::string number;
    std::string types;
    std::string date;
    std::string nearest;
};

// Read one Row from an istream.
// Note: the trailing Eater{'\n'} expects every row, the last one included, to end with '\n'.
std::istream& operator>>(std::istream& is, Row& r) {
    Eater comma{','};
    is >> std::quoted(r.number) >> comma >> std::quoted(r.types) >> comma
       >> std::quoted(r.date) >> comma >> std::quoted(r.nearest) >> Eater{'\n'};
    return is;
}

// Write one Row to an ostream
std::ostream& operator<<(std::ostream& os, const Row& r) {
    os << std::quoted(r.number) << ',' << std::quoted(r.types) << ','
       << std::quoted(r.date) << ',' << std::quoted(r.nearest) << '\n';
    return os;
}

After you've opened the file, you could then create and populate a std::vector<Row> in a very simple way:

    Row heading;
    if(file >> heading) {
        std::vector<Row> rows(std::istream_iterator<Row>(file),
                              std::istream_iterator<Row>{});
        // All Rows are now in the vector
    }
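Since the question needs a vector<vector<string>>, the Row approach can be made to produce that shape directly. The sketch below repeats Eater and Row so that it compiles on its own. loadTable is a hypothetical helper name, not something from the original answer, and it assumes that every field is quoted (std::quoted falls back to whitespace-delimited reading for unquoted input) and that every line, including the last, ends with '\n'.

```cpp
#include <iomanip>
#include <istream>
#include <iterator>
#include <string>
#include <vector>

struct Eater { char ch; };  // consumes one expected separator character

std::istream& operator>>(std::istream& is, const Eater& e) {
    char ch;
    if (is.get(ch) && ch != e.ch) is.setstate(std::istream::failbit);
    return is;
}

struct Row { std::string number, types, date, nearest; };

std::istream& operator>>(std::istream& is, Row& r) {
    Eater comma{','};
    is >> std::quoted(r.number) >> comma >> std::quoted(r.types) >> comma
       >> std::quoted(r.date) >> comma >> std::quoted(r.nearest) >> Eater{'\n'};
    return is;
}

// Hypothetical helper: reads every row (header included) and copies it into
// the vector<vector<string>> layout used by the code in the question.
std::vector<std::vector<std::string>> loadTable(std::istream& in) {
    std::vector<std::vector<std::string>> content;
    for (std::istream_iterator<Row> it(in), end; it != end; ++it)
        content.push_back({it->number, it->types, it->date, it->nearest});
    return content;
}
```

Called as loadTable(file) on the opened stream, this returns the header as content[0], so the existing header-stripping code in the question should work unchanged. The trade-off is that Row hard-codes four columns; a file with a different column count would need a different struct.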

