
C - Function read(file, buffer, bytes to read) breaking a string

I'm trying to read a file of 1024 lines, each containing the same letter nine times, and to return as soon as I find a line that doesn't match this pattern.

The file looks like this, but with 1024 lines:

eeeeeeeee
eeeeeeeee
eeeeeeeee

Code:

fd = open(fileName, O_RDONLY);
lseek(fd,0,SEEK_SET);


if(flock(fd, LOCK_SH) == -1)
        perror("error on file lock");

if(fd != 0){

    read(fd, lineFromFile, (sizeof(char)*10));
    arguments->charRead = lineFromFile[0];

    for(i=0; i < 1024; i++){        
        var = read(fd, toReadFromFile, (sizeof(char)*10));  
        if(strncmp(toReadFromFile,lineFromFile,10) != 0 || var < 10){           

            arguments->result = -1;
            printf("%s \n\n",toReadFromFile);
            printf("%s \n",lineFromFile);
            printf("i %d var %d  \n",i,var);                
            free(toReadFromFile);
            free(lineFromFile);
            return ;
        }                       
    }
}

Output:

eeeee
eeee 

eeeee
eeee 
i 954 var 6 

I have 5 different files with different letters, and every single one gives this output at that specific line (954). The line itself is correct: the letter written 9 times with a \n at the end.

Any ideas why this could be happening? If I don't use the lseek it works fine, but I need the lseek to divide the file into several parts to be tested by different threads. I put offset 0 in the lseek here to simplify the example.

Thanks.

It looks like you are looking for "eeeee\neeee" instead of "eeeeeeeee\n". That would mean your file should start like this:

eeeee
eeeeeeeee
eeeeeeeee

and end like this:

eeeeeeeee
eeee

If your file ends like this:

eeeeeeeee
eeeeeeeee

then when you get to the last line, the check will fail because you will only read "eeeee\n" instead of "eeeee\neeee".
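To illustrate how a 10-byte read that starts in the middle of a line produces the "eeeee\neeee" pattern, here is a minimal in-memory sketch. It assumes each line is 9 letters plus '\n'; RECORD_LEN and record_window are hypothetical names, not part of the question's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RECORD_LEN 10  /* assumed: 9 letters plus '\n' */

/* Copy the RECORD_LEN-byte window that starts at byte `offset` of `data`
 * into `out` and NUL-terminate it. `out` must hold RECORD_LEN + 1 bytes. */
void record_window(const char *data, size_t data_len, size_t offset, char *out)
{
    assert(offset + RECORD_LEN <= data_len);
    memcpy(out, data + offset, RECORD_LEN);
    out[RECORD_LEN] = '\0';
}
```

Called on three "eeeeeeeee\n" lines, offset 0 yields a whole line, while offset 4 yields the last five letters of one line, the newline, and the first four letters of the next.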

Given the new information in your comment, I believe the problem is that you should not be seeking to the middle of a line (in this case offsets 342 and 684). You should seek to an exact multiple of the expected string length (like 340 and 680). Also, line 954 is not where the problem happened. It should be line 954 + X, where X is the line you sought to.
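One way to choose seek offsets that always land on a line boundary is to split the file by record count and only then convert to bytes. The sketch below assumes every record is exactly 10 bytes (9 letters plus '\n'); RECORD_LEN, first_record, and part_offset are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <sys/types.h>

#define RECORD_LEN 10  /* assumed: 9 letters plus '\n' */

/* Index of the first record handled by `part` when `total_records` are
 * split as evenly as possible into `n_parts` contiguous ranges.
 * part == n_parts gives the end of the last range. */
long first_record(long total_records, int n_parts, int part)
{
    assert(n_parts > 0 && part >= 0 && part <= n_parts);
    return total_records * part / n_parts;
}

/* Byte offset to pass to lseek(); always a multiple of RECORD_LEN. */
off_t part_offset(long total_records, int n_parts, int part)
{
    return (off_t)first_record(total_records, n_parts, part) * RECORD_LEN;
}
```

With 1024 records and three parts, this gives offsets 0, 3410, and 6820. By contrast, 684 mod 10 is 4, so a read starting at byte 684 sees the last five letters of one line, the newline, and the first four letters of the next, which is consistent with the "eeeee\neeee" shown in the output.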

Whatever other problems your program may have, it certainly has this one: the read() function is not guaranteed to read the full number of bytes requested. It will read at least one byte unless it encounters an error or the end of the file, and under many circumstances it does read the full number requested, but even when enough bytes remain before the end of the file, read() may read fewer bytes than requested.

The comments urging you to use a higher-level function instead are well considered, but if you are for some reason obligated to use read() then you must watch for cases where fewer bytes are read than requested, and handle them by reading additional bytes into the unused tail end of the buffer, possibly multiple times.

In function form, that might look like this:

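#include <unistd.h>  /* read() */

/*
 * Read exactly num_to_read bytes from fd into buf, retrying after short reads.
 * Returns the number of bytes read (less than num_to_read only at end of file),
 * or a negative value if read() reports an error.
 */
int read_all(int fd, char buf[], int num_to_read) {
    int total_read = 0;
    int n_read = 0;

    while (total_read < num_to_read) {
        /* Continue filling the unused tail of the buffer. */
        n_read = read(fd, buf + total_read, num_to_read - total_read);
        if (n_read > 0) {
            total_read += n_read;
        } else {
            break;  /* 0 means end of file, negative means error */
        }
    }

    return (n_read < 0) ? n_read : total_read;
}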
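As a usage sketch, the line check from the question could be built on read_all() roughly as follows. This assumes 10-byte records (9 identical letters plus '\n'); RECORD_LEN and check_records are hypothetical names, and read_all() is repeated only so the block compiles on its own. The buffers are compared with memcmp() because they are not NUL-terminated.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

#define RECORD_LEN 10  /* assumed: 9 letters plus '\n' */

/* read_all() as above, repeated so this sketch compiles on its own. */
int read_all(int fd, char buf[], int num_to_read) {
    int total_read = 0;
    int n_read = 0;
    while (total_read < num_to_read) {
        n_read = read(fd, buf + total_read, num_to_read - total_read);
        if (n_read <= 0)
            break;
        total_read += n_read;
    }
    return (n_read < 0) ? n_read : total_read;
}

/* Hypothetical helper: read `count` records starting at the current offset
 * of fd and check that each is the same letter nine times plus '\n'.
 * Returns 0 if all match, -1 on a mismatch, a short record, or a read error. */
int check_records(int fd, long count) {
    char first[RECORD_LEN];
    char line[RECORD_LEN];

    assert(count > 0);
    if (read_all(fd, first, RECORD_LEN) != RECORD_LEN)
        return -1;
    /* Validate the first record itself. */
    for (int k = 1; k < RECORD_LEN - 1; k++)
        if (first[k] != first[0])
            return -1;
    if (first[RECORD_LEN - 1] != '\n')
        return -1;
    /* Every later record must be byte-for-byte identical to the first. */
    for (long i = 1; i < count; i++) {
        if (read_all(fd, line, RECORD_LEN) != RECORD_LEN)
            return -1;
        if (memcmp(line, first, RECORD_LEN) != 0)
            return -1;
    }
    return 0;
}
```

A caller would position fd on a line boundary first and then call check_records(fd, n) with the number of lines in its part of the file.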

