I am writing a program that counts how many times each substring given on the command line occurs in a text file (whose name is also given on the command line). The file is read into a buffer first.
When I run the code in bash, I get the error: Segmentation fault (core dumped). I am still learning how to code with C in this environment and have some sort of idea as to why the segmentation fault occurred (misuse of dynamic memory allocation?), but I could not find the problem with it. All I could conclude was that the problem is coming from within the for loop (I labeled where the potential error is being caused in the code).
EDIT: I managed to fix the segmentation fault by changing argv[j] to argv[i]. However, when I run the code now, count1 is always 0 even if the substring occurs multiple times in the text file, and I am not sure what is wrong even though I have gone through the code multiple times.
$ more foo.txt
aabbccc
$ ./main foo.txt a
0
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <string.h>
int main(int argc, char *argv[]) {
    FILE *fp;
    long lsize;
    char *buf;
    int count = 0, count1 = 0;
    int i, j, k, l1, l2;
    if (argc < 3) { printf("Error: insufficient arguments.\n"); return(1); };
    fp = fopen(argv[1], "r");
    if (!fp) {
        perror(argv[1]);
        exit(1);
    }
    //get size of file
    fseek(fp, 0L, SEEK_END);
    lsize = ftell(fp);
    rewind(fp);
    //allocate memory for entire content
    buf = calloc(1, lsize+1);
    if (!buf) {
        fclose(fp);
        fputs("Memory alloc fails.\n", stderr);
        exit(1);
    }
    //copy the file into the buffer
    if (1 != fread(buf, lsize, 1, fp)) {
        fclose(fp);
        free(buf);
        fputs("Entire read fails.\n", stderr);
        exit(1);
    }
    l1 = strlen(buf);
    //error is somewhere here
    for (i = 2; i < argc; i++) {
        for (j = 0; j < l1;) {
            k = 0;
            count = 0;
            while ((&buf[j] == argv[k])) {
                count++;
                j++;
                k++;
            }
            if (count == strlen(argv[j])) {
                count1++;
                count = 0;
            }
            else
                j++;
        }
        printf("%d\n", count1);
    }
    fclose(fp);
    return 0;
}
fread(buf, lsize, 1, fp) will read 1 block of lsize bytes. However, fread doesn't care about the contents and won't add a '\0'-terminating byte for the string, so l1 = strlen(buf); is only safe if the buffer is terminated some other way. In your code calloc zero-fills all lsize+1 bytes, so the buffer happens to be terminated; with malloc it would not be, and strlen would yield undefined behaviour. Your counting has errors as well (see below). Note that files usually don't have a 0-terminating byte at the end; that applies even to files containing text, which usually end with a newline.
It is more robust to set the 0-terminating byte yourself:
if (1 != fread(buf, lsize, 1, fp)) {
    fclose(fp);
    free(buf);
    fputs("Entire read fails.\n", stderr);
    exit(1);
}
buf[lsize] = '\0';
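As a minimal sketch of the same idea, the read-and-terminate steps could be wrapped in one function. The name read_file_to_string is hypothetical (it is not part of the original code), and opening the file in "rb" mode is an assumption made so that the size reported by ftell matches the number of bytes read on POSIX systems.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file at `path` into a freshly
 * allocated, '\0'-terminated buffer. Returns NULL on any failure.
 * The caller is responsible for free()ing the result. */
char *read_file_to_string(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (!fp)
        return NULL;

    /* Determine the file size by seeking to the end. */
    if (fseek(fp, 0L, SEEK_END) != 0) {
        fclose(fp);
        return NULL;
    }
    long size = ftell(fp);
    if (size < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    /* One extra byte for the terminator. */
    char *buf = malloc((size_t)size + 1);
    if (!buf) {
        fclose(fp);
        return NULL;
    }

    /* With an element size of 1, fread returns the number of bytes read. */
    size_t got = fread(buf, 1, (size_t)size, fp);
    fclose(fp);

    /* Terminate after the bytes actually read, so strlen() is safe. */
    buf[got] = '\0';
    return buf;
}
```

Reading with an element size of 1 (instead of one block of lsize bytes) also means an empty file is not reported as a failed read; the result is then simply an empty string.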
And you can use strstr to get the location of the substring, like this:
for(i = 2; i < argc; ++i)
{
    char *content = buf;
    int count = 0;
    while((content = strstr(content, argv[i])))
    {
        count++;
        content++; // step one char past the start of this match, so overlapping matches are counted too
    }
    printf("The substring '%s' appears %d time(s)\n", argv[i], count);
}
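The strstr loop above counts overlapping matches, because it resumes one character after the start of each hit. A hedged sketch of a helper that makes this choice explicit might look as follows. The name count_occurrences and the allow_overlap flag are hypothetical, and treating an empty substring as zero occurrences is an assumption chosen so that the loop always advances.

```c
#include <string.h>

/* Hypothetical helper: counts how often `needle` occurs in `haystack`.
 * If `allow_overlap` is non-zero, the search resumes one character after
 * the start of each match; otherwise it resumes after the end of it. */
int count_occurrences(const char *haystack, const char *needle, int allow_overlap)
{
    size_t needle_len = strlen(needle);
    int count = 0;

    /* Assumption: an empty needle counts as "no occurrences". Without
     * this guard strstr would match at every position, including the
     * terminator, and the pointer would walk past the buffer. */
    if (needle_len == 0)
        return 0;

    const char *p = haystack;
    while ((p = strstr(p, needle)) != NULL) {
        count++;
        p += allow_overlap ? 1 : needle_len;
    }
    return count;
}
```

For example, searching "aaaa" for "aa" gives 3 with overlapping matches and 2 without, so which behaviour is wanted depends on how "occurrence" is defined for the task.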
Your counting is wrong; there are several errors. This comparison
&buf[j] == argv[k]
is wrong: you are comparing pointers, not the contents. You have to use strcmp to compare strings. In this case you would have to use strncmp, because you only want to match the substring:
while(strncmp(&buf[j], argv[k], strlen(argv[k])) == 0)
{
    // substring matched
}
But this is also wrong, because you are incrementing k as well, which gives you the next argument. Eventually you might read beyond the end of argv if the substring is longer than the number of arguments. Based on your code, you would have to compare characters:
// stop at the end of argv[i], otherwise two '\0' bytes would compare equal
while(argv[i][k] != '\0' && buf[j] == argv[i][k])
{
    j++;
    k++;
}
You would have to increment the counter only when a substring is matched, like this:
l1 = strlen(buf);
for (i = 2; i < argc; i++) {
    int count = 0;
    int k = 0; // running index for inspecting argv[i]
    for (j = 0; j < l1; ++j) {
        // stop at the end of argv[i]; otherwise two '\0' bytes would
        // compare equal and the loop would run past both strings
        while(argv[i][k] != 0 && buf[j + k] == argv[i][k])
            k++;
        // if all characters of argv[i]
        // matched, argv[i][k] will be the
        // 0-terminating byte
        if(argv[i][k] == 0)
            count++;
        // reset running index for argv[i]
        // and go to the next char of buf
        k = 0;
    }
    printf("The substring '%s' appears %d time(s)\n", argv[i], count);
}
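The strncmp idea mentioned earlier can also be turned into a counting loop, as long as the index into argv is left alone and only the position in the buffer advances. The following is a sketch under that assumption. The function name count_with_strncmp is hypothetical, and returning 0 for an empty substring is a choice made here, not something the original code specifies.

```c
#include <string.h>

/* Hypothetical helper: counts (possibly overlapping) occurrences of
 * `needle` in `buf` by comparing needle_len characters at each offset. */
int count_with_strncmp(const char *buf, const char *needle)
{
    size_t buf_len = strlen(buf);
    size_t needle_len = strlen(needle);
    int count = 0;

    /* Assumption: an empty needle, or one longer than the buffer,
     * yields zero occurrences. */
    if (needle_len == 0 || needle_len > buf_len)
        return 0;

    /* Only offsets where the whole needle still fits are candidates,
     * so strncmp never looks past the terminator of buf. */
    for (size_t j = 0; j + needle_len <= buf_len; ++j) {
        if (strncmp(buf + j, needle, needle_len) == 0)
            count++;
    }
    return count;
}
```

In main this could be called once per argument, for example count_with_strncmp(buf, argv[i]) for i from 2 to argc - 1, with the result printed for each substring.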