Extracting specific data from a CSV file in C++

Question

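I have written a program to read a CSV file, but I am having trouble extracting data from it in C++. I want to count the number of columns from the 5th column of the 1st row to the last column of the 1st row. I have written the following code to read the CSV file, but I am not sure how to do the counting described above. Could someone tell me how to approach it?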

#include <cstring>
#include <fstream>
using namespace std;

char* substring(char* source, int startIndex, int endIndex)
{
    int size = endIndex - startIndex + 1;
    char* s = new char[size+1];
    strncpy(s, source + startIndex, size); //you can read the documentation of strncpy online
    s[size] = '\0'; //make it null-terminated
    return s;
}

char** readCSV(const char* csvFileName, int& csvLineCount)
{
    ifstream fin(csvFileName);
    if (!fin)
    {
        return nullptr;
    }
    // first pass: count the lines
    csvLineCount = 0;
    char line[1024];
    while (fin.getline(line, 1024))
    {
        csvLineCount++;
    }
    char **lines = new char*[csvLineCount];
    // second pass: rewind and copy every line
    fin.clear();
    fin.seekg(0, ios::beg);
    for (int i = 0; i < csvLineCount; i++)
    {
        fin.getline(line, 1024);
        lines[i] = new char[strlen(line)+1];
        strcpy(lines[i], line);
    }
    fin.close();
    return lines;
}

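I have attached a few lines of the CSV file:

Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20, ,Afghanistan,33.0,65.0,0,0,0,0,0,0,0 , ,Albania,41.1533,20.1683,0,0,0,0

What I need is the number of dates after Long in the first row.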

Answer 1

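To answer your question:

I have attached a few lines of the CSV file: Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20, ,Afghanistan,33.0,65.0,0,0,0,0,0,0,0, ,Albania,41.1533,20.1683,0,0,0,0

What I need is the number of dates after Long in the first row.

Yes, it is not that hard. This is how I would do it: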

#include <iostream>
#include <string>
#include <fstream>
#include <regex>

#define FILENAME "test.csv" // your filename as a macro
// (the preprocessor replaces FILENAME with "test.csv" before compilation)

void read()
{
    std::string n;

    // date format pattern m/dd/yy
    std::regex pattern1("\\b\\d{1}[/]\\d{2}[/]\\d{2}\\b");
    // date format pattern mm/dd/yy
    std::regex pattern2("\\b\\d{2}[/]\\d{2}[/]\\d{2}\\b");
    std::smatch result1, result2;

    std::ifstream file(FILENAME, std::ios::in);
    if (!file.is_open())
    {
        std::cout << "Could not open file!" << '\n';
        return; // nothing to read
    }

    // read one comma-separated field at a time until the input is exhausted
    // https://en.cppreference.com/w/cpp/string/basic_string/getline
    while (std::getline(file, n, ','))
    {
        // str() returns the whole match; the patterns define no capture groups
        if (std::regex_search(n, result1, pattern1))
            std::cout << result1.str() << std::endl;
        if (std::regex_search(n, result2, pattern2))
            std::cout << result2.str() << std::endl;
    }
    file.close();
}

int main()
{
    read();
    return 0;
}

The file test.csv contains the following content for testing:

    Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20, ,Afghanistan,33.0,65.0,0,0,0,0,0,0,0, ,Albania,41.1533,20.1683,0,0,0,0 
    Province/State,Country/Region,Lat,Long,1/25/20,12/26/20,1/27/20, ,Bfghanistan,33.0,65.0,0,0,0,0,0,0,0, ,Blbania,41.1533,20.1683,0,0,0,0

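It is actually quite simple:

getline reads from the open file and stops at a delimiter character, in your case the comma ','. (This is the best way I have found for reading CSV. You can replace the delimiter with any single character you need, e.g. ';' or ' '; you get the idea.)
After this, all your data is neatly separated, without the commas.
Now you can "filter" out what you need. I use regex, but use whatever you like. (FYI: for a question tagged c++ you should not use C-style functions such as strncpy.)
I gave you an example for 1/23/20 (m/dd/yy) and, in case your file contains two-digit months such as November or December, for 12/22/20 (mm/dd/yy). I split this into two lines to make the regex patterns easier to read and understand.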
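
The delimiter-based splitting described above can also be applied directly to the count asked for in the question. The following is only a minimal sketch, assuming the header row has already been read into a std::string and that the date columns start right after the first four columns (Province/State, Country/Region, Lat, Long). The function name countColumnsAfter is hypothetical and not part of the original answer.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: counts the comma-separated fields in `line`
// that come after the first `skip` fields.
std::size_t countColumnsAfter(const std::string& line, std::size_t skip) {
    std::istringstream fields(line);
    std::string field;
    std::size_t total = 0;
    // std::getline with a third argument stops at that delimiter instead of '\n'.
    while (std::getline(fields, field, ',')) {
        ++total;
    }
    // Guard against lines that have fewer fields than we want to skip.
    return total > skip ? total - skip : 0;
}
```

Calling countColumnsAfter(headerLine, 4) on the header row would give the number of columns after Long. Note that an empty field after a trailing comma is not counted by this loop.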

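If other data in your file happens to match the date format, you can (or may have to) extend the regex patterns. Regex syntax is well documented and is not as complicated as it looks.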
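
One possible extension is a single pattern that accepts one or two digits for both month and day, so that values such as 2/1/20 are recognised as well. This is a sketch under the assumption that each field has already been isolated (no surrounding whitespace) and that years always have two digits; the helper name looksLikeDate is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `field` is exactly a date such as 1/22/20 or 12/5/20.
bool looksLikeDate(const std::string& field) {
    // One or two digits for month and day, exactly two digits for the year.
    static const std::regex datePattern(R"(\d{1,2}/\d{1,2}/\d{2})");
    // regex_match requires the whole field to match, unlike regex_search.
    return std::regex_match(field, datePattern);
}
```

Because std::regex_match must consume the whole string, numeric fields such as 33.0 or header names such as Long are rejected.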

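From there you can put everything that is printed into, for example, a vector (a more convenient kind of array) to process the data and/or pass or return it. That is up to you.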
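
Collecting the matches in a vector instead of printing them could look roughly like this. It is a sketch only: the function name collectDates is hypothetical, and the pattern assumes dates of the form m/d/yy with one or two digits for month and day.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every date-like token found in `line`.
std::vector<std::string> collectDates(const std::string& line) {
    static const std::regex datePattern(R"(\b\d{1,2}/\d{1,2}/\d{2}\b)");
    std::vector<std::string> dates;
    // sregex_iterator walks over all non-overlapping matches in the string.
    for (std::sregex_iterator it(line.begin(), line.end(), datePattern), end;
         it != end; ++it) {
        dates.push_back(it->str());
    }
    return dates;
}
```

The size() of the returned vector is then the number of date columns in the line that was passed in.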

If you need more explanation, I am happy to help and/or extend this example; just leave a comment.

Answer 2

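You basically want to search your line for the separator (typically ';').
If you print out your line, it should look something like this:

a;b;c;d;e;f;g;h

There are several ways to achieve what you want; I would look for a strip or split function. Something along the lines of the examples below should work. If you use std::string, you can use std::string::find instead of the loop.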
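
With std::string, the search for separators can be written with std::string::find. A minimal sketch, assuming the line is already available as a std::string; the function name countFields is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts the fields in `line` separated by `separator`.
std::size_t countFields(const std::string& line, char separator) {
    std::size_t count = 1;  // a line without separators still holds one field
    std::size_t pos = line.find(separator);
    while (pos != std::string::npos) {
        ++count;
        pos = line.find(separator, pos + 1);  // continue after the last hit
    }
    return count;
}
```

For the sample line a;b;c;d;e;f;g;h with ';' as separator this counts eight fields. For the CSV in the question the separator would be ',' instead.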

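// Counts the separator-delimited fields in line by recursing on the
// remainder after each separator. Call with count = 0. Needs <cstring>.
int rows(const char* line, char separator, int count) {
    size_t length = strlen(line);
    size_t i = 0;
    while (i < length && line[i] != separator) i++; // find the next separator
    count++; // one more field ends here
    if (i < length) return rows(line + i + 1, separator, count); // skip the separator
    else return count;
}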

The recursion can obviously be replaced with a loop ;)

// Counts how often the character sign occurs in line. Needs <cstring>.
int countSign(const char* line, char sign) {
    size_t l = strlen(line);
    int count = 0;
    for (size_t i = 0; i < l; i++) {
        if (line[i] == sign) count++;
    }
    return count;
}
