
How to use sscanf to store variables in a line from a file?

I want to scan the line 11.0.0.0, 255.0.0.0, 10.1.0.1, eth9 and store its fields as netId, netMask, gateway and interface.

Using sscanf(buff1,"%s %s %s %s",netId,netMask,Gateway,Iface); I'm able to store these variables, but how can I store them when there is a comma (,) between the fields, as in the example above?

You have to identify what you want carefully. It is harder than you'd like, but it can be done. The trouble with %s is that it reads up to the first white space character. The comma is not white space, so it is included in the string scanned by %s, and then there is no comma left in the input to match the comma in the format string. So you need to look for a sequence of 'not commas', which is what a 'scan set' such as %[^,] does.

if (sscanf(buff1," %[^,], %[^,], %[^,], %s", netId, netMask, Gateway, Iface) != 4)
    …data was malformed…

The leading white space in the format skips optional leading spaces in the input string, like %s would skip leading white space.
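
To make the difference concrete, here is a minimal sketch that reads only the first field of a line both ways. The helper names s_keeps_comma and scanset_keeps_comma and the 64-byte buffer are assumptions made for illustration; they are not part of the question.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads the first field with plain %s.
   Returns 1 if the stored token ends in a comma, 0 if not, -1 on failure. */
int s_keeps_comma(const char *line)
{
    char field[64];
    if (sscanf(line, " %63s", field) != 1)
        return -1;
    size_t len = strlen(field);
    return len > 0 && field[len - 1] == ',';
}

/* Hypothetical helper: reads the first field with the scan set %[^,].
   Returns 1 if the stored token contains a comma, 0 if not, -1 on failure. */
int scanset_keeps_comma(const char *line)
{
    char field[64];
    if (sscanf(line, " %63[^,]", field) != 1)
        return -1;
    return strchr(field, ',') != NULL;
}
```

For the line in the question, the %s version should report that the comma ended up in the stored token, while the scan set version stops just before it.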


As Zack notes in a comment, this code does not protect you from buffer overflows. Since you didn't show the definitions of any of the variables, it is not possible to know whether this is an issue or not. If you have:

char buff1[64];
char netId[64];
char netMask[64];
char Gateway[64];
char Iface[64];

then clearly none of the individual fields can be larger than the input buffer and overflow is not possible. On the other hand, if the individual fields are smaller than the buffer, Zack is right that you could overflow them.

There are (at least) two ways to avoid that problem. First, assuming each of the target buffers is 16 bytes long (instead of 64 as shown above), then this modified code would be safe:

if (sscanf(buff1," %15[^,], %15[^,], %15[^,], %15s",
           netId, netMask, Gateway, Iface) != 4)
    …data was malformed…

This could still leave some unread characters in the input after the Iface element (for example, when the interface name is longer than 15 characters), but is otherwise safe. Note that the size specified in the conversion specification is one less than the size in the data definition; this allows for the null terminator.
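
One way to also notice leftover input is to append %n to the format and check where scanning stopped. The following is only a sketch under the 16-byte assumption used above; struct route and parse_route are hypothetical names introduced for illustration.

```c
#include <stdio.h>

/* Hypothetical record type; the 16-byte sizes follow the assumption in the text. */
struct route {
    char netId[16];
    char netMask[16];
    char gateway[16];
    char iface[16];
};

/* Returns 0 on success, -1 if the line is malformed, a field is too long,
   or anything other than white space follows the interface name. */
int parse_route(const char *line, struct route *r)
{
    int end = -1;   /* %n stores the number of characters consumed so far */

    if (sscanf(line, " %15[^,], %15[^,], %15[^,], %15s %n",
               r->netId, r->netMask, r->gateway, r->iface, &end) != 4)
        return -1;
    if (end < 0 || line[end] != '\0')
        return -1;  /* unread characters remain after the last field */
    return 0;
}
```

The %n directive does not count towards the return value, so the comparison with 4 is unchanged. The space before %n skips a trailing newline, so a line read with fgets() is still accepted.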

The alternative uses a POSIX sscanf() feature: the m 'assignment allocation' modifier. In this case, you pass the address of a char * to sscanf() and it allocates the correct amount of memory:

char *netId = 0;
char *netMask = 0;
char *Gateway = 0;
char *Iface = 0;

if (sscanf(buff1," %m[^,], %m[^,], %m[^,], %ms",
           &netId, &netMask, &Gateway, &Iface) != 4)
    …data was malformed…

free(netId);
free(netMask);
free(Gateway);
free(Iface);

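Note that POSIX only promises that memory allocated by the m modifier is freed before sscanf() returns when the call returns EOF. It does not guarantee that the pointers are left unchanged in that case, so you should not free them after an EOF return. If the call instead returns a count smaller than 4, the pointers for the conversions that did succeed still own their memory and should be freed by the caller; the remaining pointers are not assigned, which is one reason to initialize them all to null as shown above.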

You should not do this with sscanf, because you should never use *scanf for anything. There are several reasons for this; the immediately relevant ones are that it's impossible to do error recovery reliably with *scanf, and the %s and %[...] format descriptors can be used without specifying the size of the destination buffer, making them just as dangerous as the infamous gets.

I would personally do this with hand-rolled code of the general form

char *p = buf, *q;
for (q = p; *q && *q != ','; q++) {}
if (!*q) syntax_error();
*q = '\0';
netId = strdup(p);

p = q+1;
while (*p == ' ' || *p == '\t') p++; 
for (q = p; *q && *q != ','; q++) {}
if (!*q) syntax_error();
*q = '\0';
netMask = strdup(p);

// etc

There are functions in the standard library (e.g. strsep and strchr) that seem like they can improve on the above, but if you actually try to use them you discover that they don't make your code any shorter or easier to read.
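
As a sketch of how the hand-rolled steps above might be folded into one loop, the function below splits the buffer in place instead of calling strdup. The name split_fields, its signature, and the choice to reject empty fields are assumptions for illustration, not part of the original answer.

```c
#include <stddef.h>

/* Hypothetical helper: splits `buf` in place into exactly `n`
   comma-separated fields and stores a pointer to each in `fields`.
   Blanks after a comma are skipped; the last field ends at a newline
   or at the end of the string. Returns 0 on success, -1 on a syntax error. */
int split_fields(char *buf, char *fields[], size_t n)
{
    char *p = buf, *q;

    for (size_t i = 0; i < n; i++) {
        while (*p == ' ' || *p == '\t')
            p++;                                  /* skip leading blanks */
        for (q = p; *q && *q != ',' && *q != '\n'; q++) {}

        if (q == p)
            return -1;                            /* empty field */
        if (i + 1 < n) {
            if (*q != ',')
                return -1;                        /* too few fields */
        } else if (*q == ',') {
            return -1;                            /* too many fields */
        }

        fields[i] = p;
        if (*q)
            *q++ = '\0';                          /* terminate and step past the delimiter */
        p = q;
    }
    return 0;
}
```

Because the pointers in fields refer to the original buffer, the caller would still copy them (for instance with strdup, as above) if the buffer is reused for the next line.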

On a POSIX system, another reasonable option is the regex.h interfaces:

// ERROR HANDLING OMITTED FOR BREVITY

// outside the loop
regex_t linere;
regcomp(&linere, 
        "^([0-9.]+),[ \t]*([0-9.]+),[ \t]*([0-9.]+),[ \t]*([a-zA-Z0-9_]+)$",
        REG_EXTENDED);

// inside the loop
regmatch_t rm[5];
regexec(&linere, buf, 5, rm, 0);

netId = malloc(rm[1].rm_eo - rm[1].rm_so + 1);
memcpy(netId, buf + rm[1].rm_so, rm[1].rm_eo - rm[1].rm_so);
netId[rm[1].rm_eo - rm[1].rm_so] = '\0';

// etc
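
For completeness, here is a sketch of the regex.h variant with the error handling filled in. It copies into fixed-size arrays rather than calling malloc, and it compiles the pattern on every call to keep the example short; in a real loop you would compile once outside it, as shown above. The name parse_with_regex and the FIELD_MAX limit are assumptions for illustration.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

#define FIELD_MAX 32   /* hypothetical per-field capacity, including the terminator */

/* Hypothetical helper: matches `line` against the pattern from the answer
   and copies the four captured groups into `fields`.
   Returns 0 on success, -1 if the line does not match or a field is too long. */
int parse_with_regex(const char *line, char fields[4][FIELD_MAX])
{
    regex_t re;
    regmatch_t rm[5];   /* rm[0] is the whole match, rm[1..4] the groups */
    int rc = -1;

    if (regcomp(&re,
                "^([0-9.]+),[ \t]*([0-9.]+),[ \t]*([0-9.]+),[ \t]*([a-zA-Z0-9_]+)$",
                REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, line, 5, rm, 0) == 0) {
        rc = 0;
        for (int i = 0; i < 4; i++) {
            size_t len = (size_t)(rm[i + 1].rm_eo - rm[i + 1].rm_so);
            if (len >= FIELD_MAX) {
                rc = -1;                /* captured text does not fit */
                break;
            }
            memcpy(fields[i], line + rm[i + 1].rm_so, len);
            fields[i][len] = '\0';
        }
    }

    regfree(&re);
    return rc;
}
```

Note that the pattern ends in $, so a line that still carries its trailing newline would not match; strip the newline first or adjust the pattern.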

If the parsing job is even slightly more complicated than this, it may be time to reach for lex and yacc.

You need to use %[^,] in your format string to copy the string up to, but not including, the next ','.

Like this:

sscanf(buff1,"%[^,], %[^,], %[^,], %[^,]", netId, netMask, Gateway, Iface);

EDIT1: Thanks to Jonathan's comment, ',' was changed to [^,] in the format string.

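You simply put the characters you want to skip, such as the comma, into the format string. Note that %s cannot be used for the fields here, because it would read the comma as part of the string, so each field that is followed by a comma is read with a scan set instead:

sscanf(buff1,"%[^,], %[^,], %[^,], %s",netId,netMask,Gateway,Iface);

Literal characters in the format are matched and discarded, not stored: scanf and sscanf both look for an exact match of whatever non-white-space text you put in the format string. For example, if you try to read a string as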

char str[20];
scanf("hi%19s",str);

you have to enter the input as 'himystring', and what gets stored in str will be 'mystring'.
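
The same literal matching can be tried with sscanf on a fixed string. This is only a small sketch; the helper name scan_after_hi is made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: matches the literal prefix "hi", then stores the
   rest of the word in `out`, which must hold at least 20 bytes.
   Returns the sscanf() result: 1 if a word was stored, 0 if the literal
   prefix did not match, EOF if the input was empty. */
int scan_after_hi(const char *input, char out[20])
{
    return sscanf(input, "hi%19s", out);
}
```

With the input "himystring" the call should store "mystring" and return 1, while an input such as "hello" stops at the mismatching literal and returns 0 without storing anything.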

Hope that clears it up for you!

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.

 