Capture quoted strings separated with commas from a file
Let's say I want to read input from a file like this:
"8313515769001870,GRKLK,03/2023,eatcp,btlzg"
"6144115684794523,ZEATL,10/2033,arnne,drrfd"
into a structure I defined as follows:
typedef struct{
char Card_Number[20];
char Bank_Code[6];
char Expiry_Date[8];
char First_Name[30];
char Last_Name[30];
}Card;
This is my attempt to read the input from a file named 'file', opened in read mode. The str filled by fgets holds the right string, but the fields are not being stored in c[i]:
FILE * fptr;
int count=0;
fptr= fopen("file","r");
Card *c = (Card*)calloc(10,sizeof(Card));
printf("StartAlloc\n");
int i=0;
char str[1000];
fgets(str,80,fptr);
if(fptr==NULL)
{return 0;}
do{
sscanf(str,"\"%[^,],%[^,],%[^,],%[^,],%[^,]\" \n",c[i].Card_Number,c[i].Bank_Code,c[i].Expiry_Date,c[i].First_Name,c[i].Last_Name);
i++;
}while(fgets(str,80,fptr)!=NULL);
I do not understand why the regex %[^,] is not capturing the individual elements. I have wasted a lot of time on this, and any help would be greatly appreciated.
The last token doesn't end with a ',', so you can't use %[^,] for it. It is, however, followed by a '"', so you can use %[^\"] instead:
sscanf(str,"\"%[^,],%[^,],%[^,],%[^,],%[^\"]\" \n",c[i].Card_Number,c[i].Bank_Code,c[i].Expiry_Date,c[i].First_Name,c[i].Last_Name);
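As a minimal sketch of the corrected format in use, the snippet below wraps the call in a helper and checks the return value of sscanf. The name parse_card_line is hypothetical, and the field widths (19, 5, 7, 29, 29) are an addition that is not in the original format: they are derived from the array sizes in the question's struct so that an over-long field cannot overflow its buffer.

```c
#include <stdio.h>

typedef struct {
    char Card_Number[20];
    char Bank_Code[6];
    char Expiry_Date[8];
    char First_Name[30];
    char Last_Name[30];
} Card;

/* Hypothetical helper: parses one quoted, comma-separated line into *card.
   Returns 1 if all five fields were converted, 0 otherwise. */
int parse_card_line(const char *line, Card *card)
{
    /* Each width is one less than the array size, leaving room for '\0'.
       The last field stops at the closing quote instead of at a comma. */
    int n = sscanf(line, "\"%19[^,],%5[^,],%7[^,],%29[^,],%29[^\"]\"",
                   card->Card_Number, card->Bank_Code, card->Expiry_Date,
                   card->First_Name, card->Last_Name);
    return n == 5;
}
```

Comparing the return value with 5 is what tells you whether every field was actually stored; with the original format, the last %[^,] swallows the closing quote and the line break along with the last field.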
If you just need to read from the file, you could use fscanf() directly instead of reading a line from the file into a character array and then calling sscanf() on that string.
You also needn't explicitly cast the return value of calloc(). See "Is it necessary to type-cast malloc and calloc".
You are doing
if(fptr==NULL)
{return 0;}
after you have already tried to read from the file. If the file couldn't be opened, the program would most likely crash well before control reaches this if statement, because passing a null FILE pointer to fgets is undefined behavior.
Place this check right after opening the file, like
FILE *fptr = fopen("file", "r");
if(fptr==NULL)
{
return EXIT_FAILURE;
}
Also, a return value of 0 is usually taken to mean success. Since the input file not being found is an error, try returning EXIT_FAILURE instead.
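The ordering described above (open, check, and only then read) can be sketched as follows. The function name read_first_line is hypothetical, and the use of perror to report the reason is an addition not mentioned in the answer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: opens `path`, reads at most one line, and closes it.
   Returns EXIT_FAILURE if the file cannot be opened, EXIT_SUCCESS otherwise. */
int read_first_line(const char *path)
{
    FILE *fptr = fopen(path, "r");
    if (fptr == NULL) {          /* check before any fgets/fscanf call */
        perror(path);            /* report why the open failed on stderr */
        return EXIT_FAILURE;
    }

    char str[1000];
    if (fgets(str, sizeof str, fptr) != NULL) {
        fputs(str, stdout);      /* the stream is known to be valid here */
    }

    fclose(fptr);
    return EXIT_SUCCESS;
}
```

EXIT_FAILURE and EXIT_SUCCESS come from stdlib.h, so that header has to be included when they are used as return values.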
Regarding the last %[^,] in the format string of sscanf in your program: there is no comma after the last entry of each line in the input file, so change it to read until the closing " is found.
Also, at the end of the format string, there's a space followed by a \n. The \n is redundant here, as a space already matches it: "One white-space character in format-string matches any combination of white-space characters in the input".
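The white-space rule quoted above can be checked with a small experiment. This is only a sketch: the helper name consumed_by_trailing_space is hypothetical, and it relies on the standard %n directive, which stores the number of characters consumed so far and is not counted in the return value of sscanf.

```c
#include <stdio.h>

/* Hypothetical helper: returns how many characters of `input` are consumed
   by a format made of one quoted field followed by a single space,
   or -1 if the quoted field could not be read. */
int consumed_by_trailing_space(const char *input)
{
    char field[16];
    int used = 0;

    /* The single space before %n matches any run of white space,
       including '\n', '\r' and '\t'. */
    if (sscanf(input, "\"%15[^\"]\" %n", field, &used) != 1) {
        return -1;
    }
    return used;
}
```

For an input such as a quoted field followed by a line break, the returned count covers the line break as well, which is why a separate \n in the format adds nothing.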
So the final format string could be
"\"%[^,],%[^,],%[^,],%[^,],%[^\"]\" "
And don't forget to close the files you've opened and free the memory you've allocated before the end of the program, like
free(c); //for the Card pointer
fclose(fptr);
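Putting the points of this answer together, a reading loop might look like the sketch below. The function name load_cards, its capacity parameter, and the field widths in the format are assumptions added for illustration; the format is otherwise the final one given above.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    char Card_Number[20];
    char Bank_Code[6];
    char Expiry_Date[8];
    char First_Name[30];
    char Last_Name[30];
} Card;

/* Hypothetical helper: reads up to `capacity` cards from `path` into `cards`.
   Returns the number of cards stored, or -1 if the file cannot be opened. */
int load_cards(const char *path, Card *cards, int capacity)
{
    FILE *fptr = fopen(path, "r");
    if (fptr == NULL) {                  /* check right after opening */
        return -1;
    }

    char str[1000];
    int count = 0;
    while (count < capacity && fgets(str, sizeof str, fptr) != NULL) {
        Card *c = &cards[count];
        /* Widths are array size minus one; the trailing space absorbs '\n'. */
        if (sscanf(str, "\"%19[^,],%5[^,],%7[^,],%29[^,],%29[^\"]\" ",
                   c->Card_Number, c->Bank_Code, c->Expiry_Date,
                   c->First_Name, c->Last_Name) == 5) {
            count++;                     /* keep only fully parsed lines */
        }
    }

    fclose(fptr);                        /* close what was opened */
    return count;
}
```

A caller could allocate the array with Card *c = calloc(10, sizeof *c); (no cast needed), pass it as load_cards("file", c, 10), and call free(c) once the records are no longer needed.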
Using fscanf() with the proper format, you can retrieve the desired elements from each line:
"\"%[^,]%*c %[^,]%*c %[^,]%*c %[^,]%*c %[^\"]%*c\n"
With this format, the opening quote is skipped (\"), and the comma-separated strings are captured, with %*c discarding each comma (%[^,]%*c). Finally, the closing quote is discarded (%[^\"]%*c) and the line break is consumed (\n), so that the next line can be read.
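To see the %*c part in isolation, here is a small sketch on a two-field string. The helper name split_pair and the 16-byte buffers are hypothetical and chosen only for the illustration.

```c
#include <stdio.h>

/* Hypothetical helper: splits "left,right" into two buffers of 16 bytes each.
   Returns the number of fields stored (2 on success). */
int split_pair(const char *input, char left[16], char right[16])
{
    /* %15[^,] stores everything up to the comma. %*c then reads exactly one
       character (the comma) and discards it; because the assignment is
       suppressed, it is not counted in the return value. */
    return sscanf(input, "%15[^,]%*c%15s", left, right);
}
```

Calling split_pair("eatcp,btlzg", left, right) stores the two names and returns 2, not 3, since the suppressed %*c conversion does not count as an assigned item.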
This is how you can integrate it in your code:
while (fscanf(file, "\"%[^,]%*c %[^,]%*c %[^,]%*c %[^,]%*c %[^\"]%*c\n", c[i].Card_Number, c[i].Bank_Code, c[i].Expiry_Date, c[i].First_Name, c[i].Last_Name) == 5) i++;
Complete code snippet for testing purposes:
#include <stdio.h>
#include <stdlib.h>
typedef struct{
char Card_Number[20];
char Bank_Code[6];
char Expiry_Date[8];
char First_Name[30];
char Last_Name[30];
}Card;
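int main(void){
    FILE *file = fopen("data.csv", "r");
    if (file == NULL) {   /* check before reading from the stream */
        return EXIT_FAILURE;
    }
    int i = 0;
    Card *c = calloc(10, sizeof(Card));
    if (c == NULL) {
        fclose(file);
        return EXIT_FAILURE;
    }
    /* Stop after 10 records (the allocated capacity) or when a line does not yield all 5 fields. */
    while (i < 10 && fscanf(file, "\"%[^,]%*c %[^,]%*c %[^,]%*c %[^,]%*c %[^\"]%*c\n", c[i].Card_Number, c[i].Bank_Code, c[i].Expiry_Date, c[i].First_Name, c[i].Last_Name) == 5) {
        printf("%s | %s | %s | %s | %s \n", c[i].Card_Number, c[i].Bank_Code, c[i].Expiry_Date, c[i].First_Name, c[i].Last_Name);
        i++;
    }
    free(c);
    fclose(file);
    return 0;
}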