I want to know specifically what the difference in speed is between fgets() and getc() - primarily for large amounts of data. I picked getc() because in other threads someone said that it is faster than fgetc() because it can be implemented as a macro. Analogously, gets() is discouraged and deprecated because, unlike fgets(), it has no buffer limit.
I made a little example program in the hope that someone knows how to measure the time and resources used by the two alternatives. The example needs a file containing at least CHARS_PER_LINE * LINES characters. If you run the program without any arguments, it will try to copy the file into memory with getc(); if any number of arguments is passed, it will run the fgets() version.
#include <stdio.h>
#include <stdlib.h>
#define CHARS_PER_LINE 2000
#define LINES 100
int main (int argc, char *argv[]) {
// DECLARE VARS
char **data;
FILE *fp;
// ALLOCATE 2D ARRAY MEMORY
data = malloc(LINES * sizeof(char*));
for (int i = 0; i < LINES; i++) { // fixed: was CHARS_PER_LINE, which overruns the LINES pointers allocated above
data[i] = malloc((CHARS_PER_LINE + 1) * sizeof(char)); // +1 so fgets() has room for its terminating '\0'
}
// OPEN FILE FOR READING
fp = fopen ("file.txt", "r");
// COPY CHARS WITH GETC()
if (argc == 1) { // if no arguments - getc
for (int i = 0; i < LINES; i++) {
for (int ii = 0; ii < CHARS_PER_LINE; ii++) {
data[i][ii] = getc(fp);
}
}
}
// COPY CHARS WITH FGETS()
else { // if any amount of arguments passed - fgets
for (int i = 0; i < LINES; i++) {
fgets(data[i], (CHARS_PER_LINE + 1), fp);
// fgets() reads at most CHARS_PER_LINE chars here, then appends '\0' - that size argument is its buffer limit
}
}
// CLOSE FILE
fclose(fp);
return(0);
}
I am aware that this is bad and insecure and a lot of assumptions are made, but I tried to focus on making the code similar so it can be tested fairly. Logically, the hypothesis is that getc will outperform fgets for low number of chars fetched per iteration, and fgets will outperform getc for large amounts of data. The question still remains - by how much and at what rate?
In hopes that we can bring this eternal question to a conclusion with some hard numbers, please help
fgets
will read from the file until a newline character is read (or the buffer is full, or end-of-file is reached). getc
will read only a single byte. They do different things, so they are not directly comparable.
Ok, I decided to help myself, so I went and figured out a way to benchmark, which I bet will now get more criticism than I got help for the initial problem.
Here is the full code. You can get files with random chars by searching "random file generator".
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define CHARS_PER_LINE 2000
#define LINES 100
int main (int argc, char *argv[]) {
// DECLARE VARS
char **data;
FILE *fp1;
FILE *fp2;
clock_t time_before, time_after;
double time1, time2;
char check1, check2;
// ALLOCATE 2D ARRAY MEMORY
data = malloc(LINES * sizeof(char*));
for (int i = 0; i < LINES; i++) { // fixed: was CHARS_PER_LINE, which overruns the LINES pointers
data[i] = malloc((CHARS_PER_LINE + 1) * sizeof(char)); // +1 for fgets()'s terminating '\0'
}
if (argc == 3) {
// OPEN FILE 1 FOR READING
fp1 = fopen (argv[1], "r");
// COPY CHARS WITH GETC()
time_before = clock();
for (int i = 0; i < LINES; i++) {
for (int ii = 0; ii < CHARS_PER_LINE; ii++) {
data[i][ii] = getc(fp1);
}
}
time_after = clock();
time1 = (double)(time_after - time_before); // raw clock ticks; divide by CLOCKS_PER_SEC for seconds
check1 = data[50][1495];
// CLOSE FILE 1
fclose(fp1);
// OPEN FILE 2 FOR READING
fp2 = fopen (argv[2], "r");
// COPY CHARS WITH FGETS()
time_before = clock();
for (int i = 0; i < LINES; i++) {
fgets(data[i], CHARS_PER_LINE, fp2); // note: reads at most CHARS_PER_LINE - 1 chars per call, slightly less than the getc() loop
}
time_after = clock();
time2 = (double)(time_after - time_before);
check2 = data[50][1495];
// CLOSE FILE 2
fclose(fp2);
// PRINT RESULTS
// appended characters to check for consistency and accuracy of reads
printf("%s: %f\t%s: %f\t%c%c\n", argv[1], time1, argv[2], time2, check1, check2);
}
else {
puts("Wrong number of args. Put names of 2 files as arguments.");
}
return(0);
}
The output was:
a.txt: 1169.000000 b.txt: 95.000000 vp
b.txt: 826.000000 a.txt: 67.000000 be
a.txt: 1146.000000 b.txt: 91.000000 vp
b.txt: 1139.000000 a.txt: 89.000000 be
a.txt: 821.000000 b.txt: 77.000000 vp
b.txt: 1069.000000 a.txt: 91.000000 be
a.txt: 1141.000000 b.txt: 91.000000 vp
b.txt: 822.000000 a.txt: 70.000000 be
a.txt: 776.000000 b.txt: 68.000000 vp
b.txt: 996.000000 a.txt: 90.000000 be
a.txt: 1143.000000 b.txt: 92.000000 vp
b.txt: 1141.000000 a.txt: 93.000000 be
a.txt: 1138.000000 b.txt: 92.000000 vp
b.txt: 1140.000000 a.txt: 92.000000 be
As you can see, the results are overwhelmingly in favor of fgets(): roughly an order of magnitude, around 11 to 12 times faster.
If you have any useful specific comments about this, or if you would condescend to confirm or deny this research, please do.
Do I get gold now or something?
From your question and answer:
in my example I am reading full lines of known length
If the file has such a simple structure (all lines the same length), the fastest approach will be:
// ALLOCATE 2D ARRAY MEMORY
char (*data)[CHARS_PER_LINE] = malloc(LINES * sizeof(*data));
if(fread(data, sizeof(*data), LINES, fpx) != LINES) /* fread returns the number of complete items read, not bytes */
{
/* Something gone wrong, do something. */
}
But note that the lines will not be null-terminated strings.