Capture quoted strings separated with commas from a file

Let's say I want to read input from a file like this:

"8313515769001870,GRKLK,03/2023,eatcp,btlzg"
"6144115684794523,ZEATL,10/2033,arnne,drrfd"

for a structure I defined as follows:

typedef struct{
char Card_Number[20];
char Bank_Code[6];
char Expiry_Date[8];
char First_Name[30];
char Last_Name[30];
}Card;

This is my attempt to read the input from a file named 'file' opened in read mode. The str buffer filled by fgets holds the right string, but it isn't being stored into c[i]:

FILE * fptr;
int count=0;
fptr= fopen("file","r");
Card *c = (Card*)calloc(10,sizeof(Card));
printf("StartAlloc\n");
int i=0;
char str[1000];
fgets(str,80,fptr);
if(fptr==NULL)
{return 0;}
do{
     sscanf(str,"\"%[^,],%[^,],%[^,],%[^,],%[^,]\" \n",c[i].Card_Number,c[i].Bank_Code,c[i].Expiry_Date,c[i].First_Name,c[i].Last_Name);
i++;

}while(fgets(str,80,fptr)!=NULL);

I do not understand why the pattern %[^,] is not capturing the individual elements. I have spent a lot of time on this, and any help would be greatly appreciated.

The last token doesn't end with a ',', so you can't use %[^,] for it. It is, however, followed by a '"', so you can use %[^\"] instead:

sscanf(str,"\"%[^,],%[^,],%[^,],%[^,],%[^\"]\" \n",c[i].Card_Number,c[i].Bank_Code,c[i].Expiry_Date,c[i].First_Name,c[i].Last_Name);

If you just need to read from the file, you could use fscanf() directly instead of reading a line into a character array and then calling sscanf() on that string.

Also, you needn't explicitly cast the return value of calloc(). See "Is it necessary to type-cast malloc and calloc".


You are doing

if(fptr==NULL)
{return 0;}

after you have already tried to read from the file. If the file couldn't be opened, fgets() would be called with a null pointer, which is undefined behavior and would typically crash the program well before control reaches this if statement.

Place this check right after opening the file, like

FILE *fptr = fopen("file", "r");
if(fptr==NULL)
{
    return EXIT_FAILURE;
}

Also, a return value of 0 is usually taken to mean success. Since the input file not being found is an error, return EXIT_FAILURE (defined in <stdlib.h>) instead.


As for the last %[^,] in the format string of the sscanf call in your program: there is no comma after the last entry of each line in the input file, so change it to read until the closing " is found.

Also, at the end of the format string, there's a space followed by a \n. The \n is redundant here, as a single space already matches any run of white space: "One white-space character in format-string matches any combination of white-space characters in the input".

So the final format string could be

"\"%[^,],%[^,],%[^,],%[^,],%[^\"]\" "

And don't forget to close the files you've opened and free the memory you've allocated before the end of the program, like

free(c); //for the Card pointer
fclose(fptr);

Using fscanf() with a suitable format, you can retrieve the desired elements from each line:

"\"%[^,]%*c %[^,]%*c %[^,]%*c %[^,]%*c %[^\"]%*c\n" 

With this format, the opening quote is matched and discarded ( \" ), and each comma-separated string is captured while the comma after it is skipped ( %[^,]%*c ). Finally, the last field is captured and the closing quote is discarded ( %[^\"]%*c ), and the line break is consumed ( \n ) so that the next line can be read.

This is how you can integrate it into your code:

while (fscanf(file, "\"%[^,]%*c %[^,]%*c %[^,]%*c %[^,]%*c %[^\"]%*c\n", c[i].Card_Number, c[i].Bank_Code, c[i].Expiry_Date, c[i].First_Name, c[i].Last_Name) == 5 ) i++;

Complete code snippet for testing purposes:

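#include <stdio.h>
#include <stdlib.h>

typedef struct{
    char Card_Number[20];
    char Bank_Code[6];
    char Expiry_Date[8];
    char First_Name[30];
    char Last_Name[30];
}Card;

int main(){
    FILE *file;
    file = fopen("data.csv", "r");
    if (file == NULL) {   /* stop if the file cannot be opened */
        return EXIT_FAILURE;
    }
    int i=0;
    Card *c = calloc(10,sizeof(Card));
    if (c == NULL) {      /* stop if the allocation failed */
        fclose(file);
        return EXIT_FAILURE;
    }

    /* fscanf returns the number of fields stored; 5 means a full record.
       The i < 10 test keeps the loop inside the allocated array. */
    while (i < 10 && fscanf(file, "\"%[^,]%*c %[^,]%*c %[^,]%*c %[^,]%*c %[^\"]%*c\n", c[i].Card_Number, c[i].Bank_Code, c[i].Expiry_Date, c[i].First_Name, c[i].Last_Name) == 5 ) {
        printf("%s | %s | %s | %s | %s \n", c[i].Card_Number, c[i].Bank_Code, c[i].Expiry_Date, c[i].First_Name, c[i].Last_Name);
        i++;
    }
    free(c);
    fclose(file);
    return 0;
}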
