Read from csv file and separate into variables

I'm trying to separate my input values into 2 different categories. The first array, called teamname, would hold the team names, and the second array would hold the score for that week. My input file is a .csv. With the code the way it is, everything is stored as a single string instead of in 2 separate variables. Also, I'm not too program-savvy and am only familiar with the <stdio.h> library.

#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>

#define FILEIN "data.csv"
#define FILEOUT "matrix.csv"

int main (void)
{
    double nfl[32][32], teamscore[32];
    char teamname[30];
    int n;
    FILE *filein_ptr;
    FILE *fileout_ptr;

    filein_ptr = fopen (FILEIN, "r");
    fileout_ptr = fopen (FILEOUT, "w");

    for (n = 1; n <= 32; n++) {
        fscanf (filein_ptr, "%s  %lf\n", &teamname, &teamscore[n]);
        fprintf (fileout_ptr, "%s    %f\n", teamname, teamscore);
    }

    fclose (filein_ptr);
    fclose (fileout_ptr);

    return 0;
}

I should say that the input file has the first column with team names and the second column with team scores. Any help would be great. Thanks! Here is a sample input file:

  • Steelers,20
  • Patriots,25
  • Raiders,15
  • Chiefs,35

In addition to changing &teamname to teamname, there are a few other considerations you may want to look at. The first: always initialize your variables. While not required, this has a number of benefits. For numerical arrays, it initializes all elements, preventing an accidental read of an uninitialized value. For character arrays, initializing to 0 ensures that the first copy to the string (shorter than the total length) will be null-terminated, and it also prevents an attempted read of an uninitialized value. It's just a good habit:

    double teamscore[MAXS] = {0.0};
    char teamname[30] = {0};
    int n = 0;

You have defined default values for your input and output filenames (FILEIN and FILEOUT); you can do the same for your array sizes (MAXS above). That makes your code easier to maintain by providing a single value to update if your array size needs to change.

Next, and this is rather a nit, but an important one: main accepts arguments, defined by the standard as int argc, char **argv (you may also see a char **envp on Unix systems, and you may see them both written in the equivalent form char *argv[] and char *envp[]). The point here is to use them to take arguments for your program so you are not stuck with just your hardcoded data.csv and matrix.csv filenames. You can keep your hardcoded values as defaults and still give the user the ability to supply filenames of their choice by using a simple ternary operator (e.g. test ? if true code : if false code;):

    FILE *filein_ptr = argc > 1 ? fopen (argv[1], "r") : fopen (FILEIN, "r");
    FILE *fileout_ptr = argc > 2 ? fopen (argv[2], "w") : fopen (FILEOUT, "w");

There, the test is argc > 1 (meaning the user gave at least one argument). If true, the code fopen (argv[1], "r") opens the filename given as the argument for reading; if false, the code fopen (FILEIN, "r") opens your default when no filename is given. The same holds true for your output file (you must provide the arguments in the correct order).
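The argument-or-default choice can also be factored into a small function so the test is written once. This is only a sketch: arg_or_default and open_arg are hypothetical names that do not appear in the answer, and the logic is the same argc comparison described above.

```c
#include <stdio.h>

/* Hypothetical helper: return argv[idx] when the user supplied that
 * argument, otherwise the compiled-in fallback name. */
const char *arg_or_default (int argc, char **argv, int idx,
                            const char *fallback)
{
    return argc > idx ? argv[idx] : fallback;
}

/* Hypothetical helper: open the chosen name in the given mode.
 * Returns NULL on failure, exactly as fopen does. */
FILE *open_arg (int argc, char **argv, int idx,
                const char *fallback, const char *mode)
{
    return fopen (arg_or_default (argc, argv, idx, fallback), mode);
}
```

Under these assumptions, the input file would be opened with open_arg (argc, argv, 1, FILEIN, "r") and the output file with open_arg (argc, argv, 2, FILEOUT, "w"); the returned pointers still need to be checked before use.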

Then, if you open a file, you must validate that the file is actually open before you attempt to read from it. While you can test the input and output separately to tell which one failed, you can also use a simple || condition to check whether either open failed:

    if (!filein_ptr || ! fileout_ptr) {
        fprintf (stderr, "error: filein or fileout open failed.\n");
        return 1;
    }
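If you would rather know which of the two opens failed, each file can be checked on its own. The following is a minimal sketch, assuming a hypothetical helper named open_checked (not part of the answer) that reports the failing filename through the standard perror function.

```c
#include <stdio.h>

/* Hypothetical helper: open `name` in `mode`. On failure, print the
 * filename followed by the system's reason to stderr and return NULL. */
FILE *open_checked (const char *name, const char *mode)
{
    FILE *fp = fopen (name, mode);

    if (!fp)
        perror (name);   /* e.g. "data.csv: No such file or directory" */

    return fp;
}
```

The caller would still return early when either pointer is NULL; the only difference is that the message names the file that could not be opened.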

Lastly, if you know the number of lines of data you need to read, an indexed for loop like yours is fine, but you will rarely know the number of lines in a data file beforehand. Even if using a for loop, you still need to check the return of fscanf to verify that you actually had 2 valid conversions (and therefore got the 2 values you were expecting). Checking the return also provides another benefit: it allows you to continue reading until you no longer get 2 valid conversions from fscanf. This provides an easy way to read an unknown number of values from a file. However, you do need to ensure you do not try to read more values into your array than it will hold, e.g.:

    while (fscanf (filein_ptr, " %29[^,],%lf", teamname, &teamscore[n]) == 2) {
        fprintf (fileout_ptr, "%s    %f\n", teamname, teamscore[n++]);
        if (n == MAXS) {  /* check data doesn't exceed MAXS */
            fprintf (stderr, "warning: data exceeds MAXS.\n");
            break;
        }
    }

note: when using a format specifier that contains a character class (like "%[^,], ..."), be aware it will read and include leading and trailing whitespace in the conversion to string. So if your file has ' Steelers ,..', teamname will include the whitespace. You can fix the leading whitespace by including a space before the start of the conversion (like " %29[^,], ..."), and you can also limit the number of characters that can be read by specifying a maximum field width. (Trailing whitespace in this case is more easily trimmed after the read.)
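Trimming the trailing whitespace after the read could look like the sketch below. It parses from a string with sscanf rather than from a file so it is easy to try in isolation; rtrim, parse_team and NAMELEN are hypothetical names introduced for illustration, with NAMELEN assumed to match the teamname[30] buffer.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define NAMELEN 30   /* assumed buffer size, matching teamname[30] */

/* Remove trailing whitespace from s in place. */
void rtrim (char *s)
{
    size_t len = strlen (s);

    while (len > 0 && isspace ((unsigned char) s[len - 1]))
        s[--len] = '\0';
}

/* Parse one "name,score" record from line. name must have room for
 * NAMELEN bytes. Returns 1 if both fields were converted, 0 otherwise. */
int parse_team (const char *line, char *name, double *score)
{
    /* leading space skips whitespace; 29 leaves room for the '\0' */
    if (sscanf (line, " %29[^,],%lf", name, score) != 2)
        return 0;

    rtrim (name);   /* the scanset keeps whitespace before the comma */
    return 1;
}
```

With these helpers, a line such as "  Steelers  ,  20" would leave name holding "Steelers" and score holding 20.0.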

Putting all the pieces together, you could make your code more flexible and reliable by taking arguments from the user and validating your file and read operations:

#define _CRT_SECURE_NO_WARNINGS 1
#include <stdio.h>

#define FILEIN "data.csv"
#define FILEOUT "matrix.csv"
#define MAXS 32

int main (int argc, char **argv)
{
    /* double nfl[MAXS][MAXS] = {{0}}; */
    double teamscore[MAXS] = {0.0};
    char teamname[30] = {0};
    int n = 0;
    FILE *filein_ptr = argc > 1 ? fopen (argv[1], "r") : fopen (FILEIN, "r");
    FILE *fileout_ptr = argc > 2 ? fopen (argv[2], "w") : fopen (FILEOUT, "w");

    if (!filein_ptr || ! fileout_ptr) {
        fprintf (stderr, "error: filein or fileout open failed.\n");
        return 1;
    }

    while (fscanf (filein_ptr, " %29[^,],%lf", teamname, &teamscore[n]) == 2) {
        fprintf (fileout_ptr, "%s    %f\n", teamname, teamscore[n++]);
        if (n == MAXS) {  /* check data doesn't exceed MAXS */
            fprintf (stderr, "warning: data exceeds MAXS.\n");
            break;
        }
    }

    fclose (filein_ptr);
    fclose (fileout_ptr);

    return 0;
}

Test Input

$ cat ../dat/teams.txt
Steelers,   20
Patriots,25
    Raiders,    15
    Chiefs,35

note: the variations in leading whitespace and in whitespace between values were intentional.

Use/Output

$ ./bin/teams ../dat/teams.txt teamsout.txt

$ cat teamsout.txt
Steelers    20.000000
Patriots    25.000000
Raiders    15.000000
Chiefs    35.000000

Let me know if you have further questions.

If you are going to store the team names in an array, you should declare a two-dimensional array:

char team_names[N_OF_TEAMS][MAX_CHAR_IN_NAME];

Then you declare the array for the scores. You are using doubles to store the scores; aren't they only integers?

double scores[N_OF_TEAMS];

To read those values you can use:

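/* needs <stdio.h>; MAX_CHAR_IN_NAME must be #defined before this point */
int read_name_and_score( char * fname, int m, char nn[][MAX_CHAR_IN_NAME], double * ss)
{
    FILE *pf;
    int count = 0;

    if (!fname) {
        printf("Error, no file name.\n");
        return -1;
    }
    pf = fopen(fname, "r");   /* the mode is a string, not a character constant */
    if (!pf) {
        printf("An error occurred while opening file %s.\n",fname);
        return -2;
    }

    /* %lf matches the double array; 29 is MAX_CHAR_IN_NAME - 1 and bounds the name */
    while ( count < m && fscanf(pf, " %29[^,],%lf", nn[count], &ss[count]) == 2 ) count++;

    if (fclose(pf) != 0) {    /* fclose returns 0 on success */
        printf("An error occurred while closing file %s.\n",fname);
    }
    return count;
}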

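You need the [^,] to stop scanf from reading the string when it finds a comma. Note that in the actual source file the two #define lines must appear above read_name_and_score, since the function uses MAX_CHAR_IN_NAME. The main will be something like: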

#define N_OF_TEAMS 32
#define MAX_CHAR_IN_NAME 30

int main(void) {
    char team_names[N_OF_TEAMS][MAX_CHAR_IN_NAME];
    double scores[N_OF_TEAMS];
    int n;

    n = read_name_and_score("data.csv",N_OF_TEAMS,team_names,scores);
    if ( n != N_OF_TEAMS) {
        printf("Error, not enough data was read.\n");
        /* It's up to you to decide what to do now */
    }

    /* do whatever you want with data */

    return 0;
}
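As one possible way to fill in the "do whatever you want with data" step, the stored pairs could be written back out one per line. This is a sketch only: write_scores is a hypothetical function, the "%s,%.0f" output format is an assumption rather than anything the question specifies, and MAX_CHAR_IN_NAME is repeated here just so the block compiles on its own.

```c
#include <stdio.h>

#define MAX_CHAR_IN_NAME 30   /* same value as in the main above */

/* Hypothetical helper: write n name/score pairs to out, one per line.
 * Returns n on success, or -1 as soon as a write fails. */
int write_scores (FILE *out, int n, char names[][MAX_CHAR_IN_NAME],
                  const double *scores)
{
    int i;

    for (i = 0; i < n; i++)
        if (fprintf (out, "%s,%.0f\n", names[i], scores[i]) < 0)
            return -1;

    return n;
}
```

In the main above, it could be called as write_scores (stdout, n, team_names, scores) once n has been checked to be positive, or with a FILE pointer opened on the output file.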
