
fscanf format specifier - columnwise numeric data ending with or without a comment string

I have a data file with, say, 3 columns of floating-point numbers (assume the number of columns is fixed for now), followed by a last column that is a string of characters that may contain spaces (all on a single line).

The problem I am facing is that not every line of the text file contains the trailing comment string. When every line does contain it, using

fscanf(<file pointer>, "%f %f %f %[^\n]%*c", &var1, &var2, &var3, temp) works

as below: (for 3 rows of data)

FILE *fp1 = NULL;
printf("HELLO\n");
double value1 = 0.00, value2 = 0.00, value3 = 0.00;
int i = 0;
char temp[200];

fp1 = fopen(textfile, "r");
if (fp1 == NULL) {    /* check the stream before using it */
    perror("ERROR OPENING FILE ");
    return;
}
char format[] = "%lf %lf %lf %[^\n]s%*c";
i = 3;    /* read 3 rows */
while (i--)
{
    fscanf(fp1, format, &value1, &value2, &value3, temp);
    printf("%lf,%lf,%lf - ", value1, value2, value3);
    printf("%s\n", temp);
}
fclose(fp1);

The above code works if every row ends with a comment as its last column, but rows get combined whenever the comment is missing from the end of a row.

For example, it fails for file with data below:

1.00 1.1 1.4 //this is first line 
2.00 2.1 2.4 
4.00 4.1 4.4 //this is fourth line
3.00 3.1 3.4
5.00 5.1 5.4 //this is fifth line

And gives output as:

HELLO
1.000000,1.100000,1.400000 - //this is first line
2.000000,2.100000,2.400000 - 4.00 4.1 4.4 //this is fourth line
3.000000,3.100000,3.400000 - 5.00 5.1 5.4 //this is fifth line

Hope my issue is clear.

Well, of course it fails, since you explicitly ask for three doubles separated by white space, then more white space, then some arbitrarily long text, then another character (the newline).

You need to make the comment optional. Use a combination of fscanf() and fgets():

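char format[] = "%lf %lf %lf";

/* Only read the rest of the line if all three numbers were converted. */
if (fscanf(fp1, format, &value1, &value2, &value3) == 3) {
    /* The rest of the line is either the comment or just the newline. */
    if (fgets(temp, sizeof temp, fp1)) {
        temp[strcspn(temp, "\n")] = '\0';
    }
}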

The problem is subtle. In a scanf()-family format string, a space indicates 0 or more white space characters, which means blanks, tabs or newlines. Also, all format specifiers except %c, %[…] and %n skip optional leading white space. And the functions do not care in the slightest about newlines in the white space (newlines only matter in scan-sets). If you need line-oriented input, then you should read the lines with fgets() or POSIX's getline() and then process the string with sscanf().
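To make the white space rules concrete, here is a minimal sketch. The helper names scan_three_ints and char_after_int are made up for illustration and are not part of the question's code.

```c
#include <stdio.h>

/* Hypothetical helper: counts how many integers "%d %d %d" converts.
 * Each space in the format matches any run of blanks, tabs or newlines. */
int scan_three_ints(const char *text, int out[3])
{
    return sscanf(text, "%d %d %d", &out[0], &out[1], &out[2]);
}

/* Hypothetical helper: returns the character that %c reads right after an
 * integer, or '\0' if the conversions fail. %c does not skip white space. */
char char_after_int(const char *text)
{
    int n;
    char c;
    return sscanf(text, "%d%c", &n, &c) == 2 ? c : '\0';
}
```

With these helpers, scan_three_ints("1 2\n3\n", out) should return 3 even though the numbers span two lines, and char_after_int("7\nx") should return the newline itself rather than the x.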

Your format string is:

char format[] = "%lf %lf %lf %[^\n]s%*c";

When the code reaches the data lines:

2.00 2.1 2.4 
4.00 4.1 4.4 //this is fourth line

the format reads the three numbers 2.00, 2.1 and 2.4; it then scans for some white space, reading the newline, and finds the 4 of 4.00, then processes the scan-set up to the newline, and then reads and discards the newline with %*c.
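The merge can be reproduced in isolation with sscanf() on a string that holds both rows. This is only a sketch: scan_two_rows is a hypothetical helper, and the 199 field width is an addition that keeps the scan-set inside a 200-byte buffer.

```c
#include <stdio.h>

/* Hypothetical helper: applies the question's conversions to text that may
 * hold more than one row, and returns the number of successful conversions.
 * rest must point to at least 200 bytes. */
int scan_two_rows(const char *text, double v[3], char *rest)
{
    /* The space before the scan-set also consumes the newline after 2.4. */
    return sscanf(text, "%lf %lf %lf %199[^\n]", &v[0], &v[1], &v[2], rest);
}
```

Called with "2.00 2.1 2.4 \n4.00 4.1 4.4 //this is fourth line\n", it should return 4, with rest holding the whole of the second row ("4.00 4.1 4.4 //this is fourth line").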

If you wanted to stick with scanf() or fscanf() regardless, part of the fix would be to remove the space before the scan-set. You would also need to analyze the return value from fscanf(), more or less as shown below. Note that when there is no comment, the newline would be left in the input buffer, but the next fscanf() would skip that newline before reading the next number.
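A minimal sketch of that fscanf()-only variant might look like the following. The helper read_row, its return convention, and the trimming of leading blanks are assumptions made for illustration; they are not taken from the question.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original code).
 * Reads one row from fp: three doubles plus an optional trailing comment.
 * Returns 1 if a row was read, 0 at end of input or on malformed data.
 * comment must point to at least 200 bytes; it is left empty if absent. */
int read_row(FILE *fp, double v[3], char *comment)
{
    comment[0] = '\0';
    /* No space before the scan-set, so it cannot run on to the next line. */
    int nf = fscanf(fp, "%lf %lf %lf%199[^\n]", &v[0], &v[1], &v[2], comment);
    if (nf < 3)
        return 0;
    /* The scan-set keeps the blanks that follow the third number; drop them. */
    size_t skip = strspn(comment, " \t");
    memmove(comment, comment + skip, strlen(comment + skip) + 1);
    return 1;
}
```

Because the scan-set no longer skips white space, a row that ends in a stray blank is reported by fscanf() as having a comment consisting of that blank, which is why the sketch trims leading blanks. A comment longer than 199 characters would leave its tail in the stream and make the next call fail, so this is not a robust parser.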

However, you are best off moving to line-based input. When you use sscanf() instead of the direct file scanning functions, you can leave the space in the format string. Part of the fix is to check how many values were successfully scanned. For ease of comprehension of the output, I surround the comment material with square brackets in the format string.

Extracting your code into something resembling an MCVE (Minimal, Complete, Verifiable Example) and reading from standard input instead of a file stream that you open, you can end up with code like this:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char line[4096];

    /* Read one line at a time so a newline can never be skipped as white space. */
    while (fgets(line, sizeof(line), stdin) != 0)
    {
        char temp[200];
        double value1 = 0.00, value2 = 0.00, value3 = 0.00;
        /* 199 limits the comment to what temp can hold, plus the terminator. */
        char format[] = "%lf %lf %lf %199[^\n]";
        int nf = sscanf(line, format, &value1, &value2, &value3, temp);
        if (nf == 4)        /* three numbers and a comment */
            printf("Comment: %lf,%lf,%lf - [%s]\n", value1, value2, value3, temp);
        else if (nf == 3)   /* three numbers only */
            printf("Plain:   %lf,%lf,%lf\n", value1, value2, value3);
        else
        {
            line[strcspn(line, "\n")] = '\0';
            printf("Invalid: [%s]\n", line);
        }
    }
    return 0;
}

Example run:

Comment: 1.000000,1.100000,1.400000 - [//this is first line ]
Plain:   2.000000,2.100000,2.400000
Comment: 4.000000,4.100000,4.400000 - [//this is fourth line]
Plain:   3.000000,3.100000,3.400000
Comment: 5.000000,5.100000,5.400000 - [//this is fifth line]

Interestingly, the data copied from the question has a blank at the end of the second line. An earlier version of the code omitted the space before the scan-set in the format string. The second line then showed up as a comment line, not a plain line, and all the comments included the leading space. Beware: scanf() format strings are incredibly difficult to master.
