Read a specific number of characters from a text file using fread and fseek

Let's say I have a text file like this:

This is a text file which contains some numbers.

I want to use fseek and fread to read parts of the text file. For example, from position 0 to 13, I'll get "This is a text". Then from position 14 to 24, I'll get " file which", and from position 25 to the end of the file, I'll get " contains some numbers."

I've tried to use fseek and fread, but I get some additional weird characters, like "This is a text?"

My attempt to use fseek and fread:

src = fopen(textfile, "r");
int chunksize = data[i].end - data[i].start;
char *buffer = malloc(sizeof(chunksize));

seek(src, data[i].start, SEEK_SET);
fread(buffer, 1, chunksize, src);
fseek(src, 0, SEEK_SET); // seek back to beginning of file      

where data[i].start is the position at which to start reading the part and data[i].end is the position at which to stop. For example, to read from 14 to 24 and get " file which", start is 14 and end is 25.

You need to rewrite your code approximately as follows:

src = fopen(textfile, "r");
int chunksize = ...
char *buffer = malloc(chunksize + 1);   // one extra byte for the terminator
fseek(src, data[i].start, SEEK_SET);
size_t len = fread(buffer, 1, chunksize, src);
buffer[len] = '\0';                     // terminate after the bytes actually read

We now have a buffer containing what was read from the file, followed by a string terminator placed right after the content actually read (which may be less than you asked for). If you now issue:

printf("%s\n", buffer);

You will get exactly what you read from the file.
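The steps above can be put together as one function. This is only a sketch: `read_range` is a hypothetical name, the range is treated as half-open (`end` is one past the last byte, as in the question), and the file is opened with "rb" rather than "r" on the assumption that raw byte offsets are wanted, since the C standard only guarantees arbitrary `fseek` offsets on binary streams.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read bytes [start, end) of the file at `path` into a
 * newly allocated, NUL-terminated buffer. Returns NULL on failure; the
 * caller frees the result. */
char *read_range(const char *path, long start, long end)
{
    if (start < 0 || end < start)
        return NULL;

    FILE *src = fopen(path, "rb");   /* assumption: binary mode for byte offsets */
    if (src == NULL)
        return NULL;

    size_t chunksize = (size_t)(end - start);
    char *buffer = malloc(chunksize + 1);   /* +1 for the terminator */
    if (buffer == NULL) {
        fclose(src);
        return NULL;
    }

    if (fseek(src, start, SEEK_SET) != 0) {
        free(buffer);
        fclose(src);
        return NULL;
    }

    /* fread may return fewer bytes than requested, e.g. near end of file. */
    size_t len = fread(buffer, 1, chunksize, src);
    buffer[len] = '\0';

    fclose(src);
    return buffer;
}
```

With the sample file from the question, `read_range(textfile, 14, 25)` should yield " file which", and a range that runs past the end of the file simply yields a shorter string.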

PS: It is a good idea to check the return from fopen() to be sure that the file opened properly, the return from malloc() to ensure that memory was allocated successfully, and the return from fread() to ensure that the expected amount of data was read.

Random characters after the end of the data arise from not null-terminating the input string, or from not limiting the output to the data that was read. fread() does not null-terminate what it reads; it would be useless if it did.
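The second option, limiting the output to the data that was read, needs no terminator at all. A minimal sketch, assuming a hypothetical helper `format_chunk` that uses a printf precision (`%.*s`) so that nothing beyond the first `n` bytes is ever examined:

```c
#include <stdio.h>

/* Hypothetical helper: format exactly n bytes of an unterminated buffer
 * into `out`. The precision given to %.*s caps how many bytes are read
 * from `data`, so garbage after the data is never printed.
 * Returns what snprintf returns (the length of the full formatted text). */
int format_chunk(char *out, size_t outsize, const char *data, size_t n)
{
    return snprintf(out, outsize, "<<%.*s>>", (int)n, data);
}
```

Here `n` would be the value returned by fread(). The cast to `int` is required because the `*` precision argument must be an `int`.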

You are allocating too little memory. You have:

char *buffer = malloc(sizeof(chunksize));

You need:

char *buffer = malloc(chunksize);

Or you could allocate an extra byte and store a null byte '\0' in that. If you're going to pass the buffer to code that needs a string, this is better:

char *buffer = malloc(chunksize + 1);
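To see why the original allocation is too small: `sizeof(chunksize)` is the size of the `int` variable itself (commonly 4 bytes), whatever value it holds. The two functions below are hypothetical names used only to make the difference visible.

```c
#include <stddef.h>

/* What the original code asked malloc() for: the size of an int object,
 * regardless of the value stored in chunksize. */
size_t wrong_size(int chunksize)
{
    return sizeof(chunksize);
}

/* What is actually needed: chunksize bytes of data plus one byte for the
 * terminating '\0'. */
size_t right_size(int chunksize)
{
    return (size_t)chunksize + 1;
}
```

For the 11-character chunk " file which", `right_size(11)` is 12, while `wrong_size(11)` is just `sizeof(int)`, so fread() would write past the end of the allocation.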

You should be checking the error returns of most functions; specifically, you should be paying attention to malloc(), fopen() and fread(). You also need to use fseek(), not seek().

You might use the following (err_exit() is not a standard function; it is assumed here to be a helper that prints the message and exits):

src = fopen(textfile, "r");
if (src == 0)
    err_exit("Failed to open %s for reading\n", textfile);

int chunksize = data[i].end - data[i].start;
char *buffer = malloc(chunksize + 1);
if (buffer == 0)
    err_exit("Failed to allocate %d bytes memory\n", chunksize);

fseek(src, data[i].start, SEEK_SET);
size_t nbytes = fread(buffer, 1, chunksize, src);
fseek(src, 0, SEEK_SET);
buffer[nbytes] = '\0';
if (nbytes != 0)
    printf("Read: <<%.*s>>\n", (int)nbytes, buffer);

or:

    printf("Read: <<%s>>\n", buffer);
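Because SEEK_SET offsets are measured from the start of the file, several ranges can be read from one open stream without rewinding in between. A minimal sketch, assuming a hypothetical helper `read_at` that fills a caller-supplied buffer:

```c
#include <stdio.h>

/* Hypothetical helper: read up to n bytes starting at byte offset `start`
 * from an already open stream into buf, which must have room for n + 1
 * bytes. Returns the number of bytes read; buf is always NUL-terminated,
 * and is the empty string if the seek fails. */
size_t read_at(FILE *src, long start, size_t n, char *buf)
{
    size_t nbytes = 0;

    if (fseek(src, start, SEEK_SET) == 0)
        nbytes = fread(buf, 1, n, src);

    buf[nbytes] = '\0';
    return nbytes;
}
```

Calling `read_at(src, 14, 11, buf)` and then `read_at(src, 0, 14, buf)` on the sample file should give " file which" followed by "This is a text", in either order.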
