Reading multiple lines with different data types in C

Question

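I have a very strange problem. I am trying to read a .txt file in C, and the data is structured like this: %s %s %d %d. Because I have to read each string all the way to the '\n', I am reading it like this: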

while(!feof(file)){
        fgets(s[i].title,MAX_TITLE,file);
        fgets(s[i].artist,MAX_ARTIST,file);
        char a[10];
        fgets(a,10,file);
        sscanf(a,"%d %d",&s[i].time.min,&s[i++].time.sec);
    }

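However, the very first integer I read into s.time.min shows a random large number.

I am using sscanf now because a few people had a similar problem, but it does not help.

Thanks!

Edit: the integers represent a time. Together they will never exceed 5 characters, including the space between them.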

Answer 1

Note: I am taking your post to mean that the values are read from 3 separate lines, e.g.:

%s
%s
%d %d

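(This is primarily evidenced by your use of fgets, a line-oriented input function that reads one line of input, up to and including the '\n', each time it is called.) If that is not the case, then the following does not apply (and can be greatly simplified).

Since you are reading multiple values into a single element of an array of structs, it may be better (and more robust) to read each value into a temporary variable and validate it before you start copying information into the struct members themselves. This allows you to (1) validate the read of all values, and (2) validate the parse, or conversion, of all required values before storing the members in your struct and incrementing your array index.

Additionally, you will need to remove the trailing '\n' from both title and artist to prevent embedded newlines from dangling off the end of your strings (which will cause havoc when searching for a title or artist). For example, putting it all together, you could do something like: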

void rmlf (char *s);
....
char title[MAX_TITLE] = "";
char artist[MAX_ARTIST] = "";
char a[10] = "";
int min, sec;
...
while (fgets (title, MAX_TITLE, file) &&     /* validate read of values */
       fgets (artist, MAX_ARTIST, file) &&
       fgets (a, 10, file)) {

    if (sscanf (a, "%d %d", &min, &sec) != 2) {  /* validate conversion */
        fprintf (stderr, "error: failed to parse 'min' 'sec'.\n");
        continue;  /* skip line - tailor to your needs */
    }

    rmlf (title);   /* remove trailing newline */
    rmlf (artist);

    s[i].time.min = min;    /* copy to struct members & increment index */
    s[i].time.sec = sec;
    strncpy (s[i].title, title, MAX_TITLE);
    strncpy (s[i++].artist, artist, MAX_ARTIST);
}

/** remove trailing newline from 's'. */
void rmlf (char *s)
{
    if (!s || !*s) return;
    for (; *s && *s != '\n'; s++) {}
    *s = 0;
}

(Note: this also reads all values until EOF is encountered, without using feof. See the related link: Why is "while ( !feof (file) )" always wrong?)

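Protecting Against a Short Read with fgets

Following on from Jonathan's comment: when using fgets you should check that you actually read the entire line and did not experience a short read, where the maximum character count you supplied is insufficient to hold the whole line (a short read, because characters in that line remain unread).

If a short read occurs, it will completely destroy your ability to read any further lines from the file unless you handle the failure correctly. This is because the next read will not start on the line you think it is reading. Instead, it will attempt to read the remaining characters of the line where the short read occurred.

You can validate a read by fgets by checking that the last character read into your buffer is in fact a '\n' character. (If the line is longer than the maximum you specify, the last character before the nul-terminating character will be an ordinary character instead.) If a short read is encountered, you must then read and discard the remaining characters of the long line before continuing with your next read. (The exception is a dynamically allocated buffer, where you can simply realloc as required to hold the remainder of the line, and your data structure.)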

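Your situation complicates the validation because each struct element requires data from 3 lines of the input file. You must always keep your 3-line read in sync, reading all 3 lines as a group during each iteration of your read loop (even if a short read occurs). That means you must validate that all 3 lines were read and that no short read occurred, so that any one short read can be handled without exiting your input loop. (You could validate each read individually if you just want to terminate input on any one short read, but that leads to a very inflexible input routine.)

In addition to removing the trailing newline from the input, you can adapt the rmlf function above into a function that also validates each read by fgets. I have done that below in a function called shortread. The changes to the original function and read loop could be coded as follows: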

int shortread (char *s, FILE *fp);
...
    for (idx = 0; idx < MAX_SONGS;) {

        int t, a, b;
        t = a = b = 0;

        /* validate fgets read of complete line */
        if (!fgets (title, MAX_TITLE, fp)) break;
        t = shortread (title, fp);

        if (!fgets (artist, MAX_ARTIST, fp)) break;
        a = shortread (artist, fp);

        if (!fgets (buf, MAX_MINSEC, fp)) break;
        b = shortread (buf, fp);

        if (t || a || b) continue;  /* if any shortread, skip */

        if (sscanf (buf, "%d %d", &min, &sec) != 2) { /* validate conversion */
            fprintf (stderr, "error: failed to parse 'min' 'sec'.\n");
            continue;  /* skip line - tailor to your needs */
        }

        s[idx].time.min = min;   /* copy to struct members & increment index */
        s[idx].time.sec = sec;
        strncpy (s[idx].title, title, MAX_TITLE);
        strncpy (s[idx].artist, artist, MAX_ARTIST);
        idx++;
    }
...
/** validate complete line read, remove trailing newline from 's'.
 *  returns 1 on shortread, 0 - valid read, -1 invalid/empty string.
 *  if shortread, read/discard remainder of long line.
 */
int shortread (char *s, FILE *fp)
{
    if (!s || !*s) return -1;
    for (; *s && *s != '\n'; s++) {}
    if (*s != '\n') {
        int c;
        while ((c = fgetc (fp)) != '\n' && c != EOF) {}
        return 1;
    }
    *s = 0;
    return 0;
}

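(Note: in the example above, the result of the shortread check is saved (in t, a and b) for each of the lines that make up a title, artist, time group, and the three results are tested together.)

To validate the approach, I put together a short example that should help put it all in context. Look over the example and let me know if you have any further questions.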

#include <stdio.h>
#include <string.h>

/* constant definitions */
enum { MAX_MINSEC = 10, MAX_ARTIST = 32, MAX_TITLE = 48, MAX_SONGS = 64 };

typedef struct {
    int min;
    int sec;
} stime;

typedef struct {
    char title[MAX_TITLE];
    char artist[MAX_ARTIST];
    stime time;
} songs;

int shortread (char *s, FILE *fp);

int main (int argc, char **argv) {

    char title[MAX_TITLE] = "";
    char artist[MAX_ARTIST] = "";
    char buf[MAX_MINSEC] = "";
    int  i, idx, min, sec;
    songs s[MAX_SONGS] = {{ .title = "", .artist = "" }};
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }

    for (idx = 0; idx < MAX_SONGS;) {

        int t, a, b;
        t = a = b = 0;

        /* validate fgets read of complete line */
        if (!fgets (title, MAX_TITLE, fp)) break;
        t = shortread (title, fp);

        if (!fgets (artist, MAX_ARTIST, fp)) break;
        a = shortread (artist, fp);

        if (!fgets (buf, MAX_MINSEC, fp)) break;
        b = shortread (buf, fp);

        if (t || a || b) continue;  /* if any shortread, skip */

        if (sscanf (buf, "%d %d", &min, &sec) != 2) { /* validate conversion */
            fprintf (stderr, "error: failed to parse 'min' 'sec'.\n");
            continue;  /* skip line - tailor to your needs */
        }

        s[idx].time.min = min;   /* copy to struct members & increment index */
        s[idx].time.sec = sec;
        strncpy (s[idx].title, title, MAX_TITLE);
        strncpy (s[idx].artist, artist, MAX_ARTIST);
        idx++;
    }
    if (fp != stdin) fclose (fp);   /* close file if not stdin */

    for (i = 0; i < idx; i++)
        printf (" %2d:%2d  %-32s  %s\n", s[i].time.min, s[i].time.sec, 
                s[i].artist, s[i].title);

    return 0;
}

/** validate complete line read, remove trailing newline from 's'.
 *  returns 1 on shortread, 0 - valid read, -1 invalid/empty string.
 *  if shortread, read/discard remainder of long line.
 */
int shortread (char *s, FILE *fp)
{
    if (!s || !*s) return -1;
    for (; *s && *s != '\n'; s++) {}
    if (*s != '\n') {
        int c;
        while ((c = fgetc (fp)) != '\n' && c != EOF) {}
        return 1;
    }
    *s = 0;
    return 0;
}

Example Input

$ cat ../dat/titleartist.txt
First Title I Like
First Artist I Like
3 40
Second Title That Is Way Way Too Long To Fit In MAX_TITLE Characters
Second Artist is Fine
12 43
Third Title is Fine
Third Artist is Way Way Too Long To Fit in MAX_ARTIST
3 23
Fourth Title is Good
Fourth Artist is Good
32274 558212 (too long for MAX_MINSEC)
Fifth Title is Good
Fifth Artist is Good
4 27

Example Use/Output

$ ./bin/titleartist <../dat/titleartist.txt
  3:40  First Artist I Like               First Title I Like
  4:27  Fifth Artist is Good              Fifth Title is Good

Answer 2

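I would use strtok() and atoi() instead of sscanf().

Just curious, why only 10 bytes for the two integers? Are you sure they will always be that small?

By the way, I apologize for giving you a short answer. I'm sure there is a way to make sscanf() work for you, but in my experience sscanf() can be rather finicky, so I'm not a big fan of it. When parsing input in C, I have found it more efficient (in terms of how long it takes to write and debug the code) to tokenize the input with strtok() and convert each piece individually with the various ato? functions (atoi, atof, atol, strtod, etc.; see stdlib.h). It keeps things simpler, because each piece of input is handled individually, which makes debugging any problems (should they arise) much easier. In the end, I typically spend much less time getting such code to work reliably than I did when I tried using sscanf() in the past.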

Answer 3

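Use "%*s %*s %d %d" as your format string, instead...

You seem to be expecting sscanf to automatically skip the two tokens leading up to the decimal digit fields. It won't do that unless you explicitly tell it to (hence the pair of %*s).

You can't expect the people who designed C to have designed it the same way you would. You NEED to check the return value, as iharob said.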

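That's not all. You NEED to read (and understand reasonably well) the entire scanf manual (the one written by the OpenGroup is OK). That way you will know how to use the function (including all of the subtle nuances of format strings) and what to do with the return value.

As a programmer, you need to read. Remember that.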
