C copy file contents from EOF to SOF

My program is almost working as it should. The intended purpose is to read the file from the end and copy the contents to a destination file. What confuses me is lseek(), and in particular how I should be setting the offset.

My src contents at the moment are:
Line 1
Line 2
Line 3

At the moment what I get in my destination file is:
Line 3
e 2
e 2...

From what I understand, calling int loc = lseek(src, -10, SEEK_END); will move the "cursor" in the source file to the end, then offset it from EOF towards SOF by 10 bytes, and the value of loc will be the size of the file after I have deducted the offset. However, after 7h of C I'm almost brain dead here.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char* argv[])
{
    // Open source & destination file
    int src = open(argv[1], O_RDONLY, 0777);
    int dst = open(argv[2], O_CREAT|O_WRONLY, 0777);

    // Check if either reported an error
    if(src == -1 || dst == -1)
    {
        perror("There was a problem with one of the files.");
    }

    // Set buffer & block size
    char buffer[1];
    int block;

    // Set offset from EOF
    int offset = -1;

    // Set file pointer location to the end of file
    int loc = lseek(src, offset, SEEK_END);

    // Read from source from EOF to SOF
    while( loc > 0 )
    {
        // Read bytes
        block = read(src, buffer, 1);

        // Write to output file
        write(dst, buffer, block);

        // Move the pointer again
        loc = lseek(src, loc-1, SEEK_SET);
    }

}

lseek() doesn't change or return the file size. What it returns is the position the 'cursor' is set to. So when you call

loc = lseek(src, offset, SEEK_END);

twice, it will always set the cursor to the same position again. I guess you want to do something like this:

while( loc > 0 )
{
    // Read bytes
    block = read(src, buffer, 5);

    // Write to output file
    write(dst, buffer, block);

    // Move the pointer again five bytes before the last offset
    loc = lseek(src, loc+offset, SEEK_SET);
}
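To make the return value concrete, here is a minimal sketch. The helper names file_size and seek_from_end are my own, not part of any API; they only wrap lseek(). For a 21-byte file, seek_from_end(fd, 10) returns 11 no matter how often you call it, because SEEK_END always measures from the end of the file and not from the current position.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the size of the file behind fd, or -1 on error.
   Seeking 0 bytes from SEEK_END puts the cursor at EOF, and the position
   that lseek() reports there equals the file size. */
off_t file_size(int fd)
{
    return lseek(fd, 0, SEEK_END);
}

/* Hypothetical helper: moves the cursor to `back` bytes before EOF and
   returns that absolute position, or -1 if it would lie before the
   start of the file. */
off_t seek_from_end(int fd, off_t back)
{
    return lseek(fd, -back, SEEK_END);
}
```

So the asker's loc is "file size minus 10" only on that one call. To keep walking backwards you have to compute a new position yourself, for example with SEEK_SET as in the loop above.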

If the line length is variable, you could do something like the following instead:

// define an offset that exceeds the maximum line length
int offset = 256;
char buffer[256];
// determine the file size
off_t size = lseek( src, 0, SEEK_END );
off_t pos = size;   // end of the part that has not been copied yet
// read a block of offset bytes that ends at pos
while( pos > 0 ) {
    pos -= offset;
    if( pos < 0 ) {
        // pos must not be negative ...
        offset += pos;   // in fact decrements offset!!
        pos = 0;
    }
    lseek( src, pos, SEEK_SET );
    // add error checking here!!
    read( src, buffer, offset );
    // we expect the last byte read to be a newline but we are interested in the LAST one BEFORE that;
    // memchr() would return the first newline in the block, so scan backwards instead
    char *p = buffer + offset - 1;
    while( p > buffer && p[-1] != '\n' )
        p--;             // p ends up at the beginning of the last line
    int len = offset - (p-buffer);  // and its length
    write( dst, p, len );
    pos += p - buffer;   // the next block ends where the line just written begins
}

I think you should be using SEEK_CUR instead of SEEK_END in your final call to lseek():

// Set file pointer location to the end of file
int loc = lseek(src, offset, SEEK_END);

// Read from source from EOF to SOF
while( loc > 0 )
{
    // Read bytes
    block = read(src, buffer, 5);

    // Write to output file
    write(dst, buffer, block);

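    // Move the pointer again: back over the 5 bytes just read plus 5 more,
    // and store the new position so the loop can terminate
    loc = lseek(src, -10, SEEK_CUR);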
}

You could also do:

// Set file pointer location to the end of file
int loc = lseek(src, offset, SEEK_END);

// Read from source from EOF to SOF
while( loc > 0 )
{
    // Read bytes
    block = read(src, buffer, 5);

    // Write to output file
    write(dst, buffer, block);

    // Move the pointer again
    loc -= 5;
    lseek(src, loc, SEEK_SET);
}
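For completeness, here is a sketch of the same idea as a self-contained function with error checking. The name copy_blocks_reversed and the BLOCK constant are my own choices for illustration. It assumes src is a regular, seekable file and treats a short read as an error, which is a simplification. Note that it reverses the order of the blocks, not the bytes inside each block.

```c
#include <sys/types.h>
#include <unistd.h>

#define BLOCK 5   /* chunk size, matching the 5-byte reads above */

/* Copies src to dst in BLOCK-sized chunks, last chunk first.
   Returns 0 on success, -1 on any I/O error. */
int copy_blocks_reversed(int src, int dst)
{
    char buffer[BLOCK];
    off_t pos = lseek(src, 0, SEEK_END);   /* position at EOF == file size */
    if (pos == -1)
        return -1;

    while (pos > 0) {
        /* The chunk nearest the start of the file may be shorter. */
        size_t want = pos < BLOCK ? (size_t)pos : BLOCK;
        pos -= (off_t)want;

        if (lseek(src, pos, SEEK_SET) == -1)
            return -1;
        ssize_t got = read(src, buffer, want);
        if (got != (ssize_t)want)
            return -1;
        if (write(dst, buffer, (size_t)got) != got)
            return -1;
    }
    return 0;
}
```

With a 12-byte source containing abcdefghijkl this writes hijkl, then cdefg, then ab. That is also why a fixed block size gives output like "e 2" when the blocks do not line up with the line boundaries.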

From some of your comments it looks like you want to reverse the order of the lines in a text file. Unfortunately you're not going to get that with such a simple program. There are several approaches you can take, depending on how complicated you want to get, how big the files are, how much memory is on hand, how fast you want it to be, etc.

Here are some different ideas off the top of my head:

  • Read your whole source file at once into a single memory block. Scan forwards through the memory block looking for line breaks, and record the pointer and length of each line. Save these records onto a stack (you could use a dynamic array, or an STL vector in C++). Then, to write your output file, pop one line's record off the stack at a time (moving backwards through the array) and write that line, until the stack is empty (you've reached the beginning of the array).

  • Start at the end of your input file, but for each line, seek backwards character-by-character until you find the newline that ends the previous line. Seek forwards again past that newline and then read in the line. (You should now know its length.) Or, you could just build up the reversed characters in a buffer and then write them out backwards.

  • Pull in whole blocks (sectors perhaps) of the file at once, from end to beginning. Within each block, locate the newlines in a similar fashion to the method above, except that now you already have the characters in memory and so don't need to do any reversing or pull them in redundantly. However, this solution will be much more complicated because lines can span block boundaries.
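As a rough sketch of the first idea, assuming the file fits in memory and ends with a newline: the function below (reverse_lines is a name I made up) reads the whole file into one block. Instead of recording every line on a stack, it simply scans that block backwards for newlines, which gives the same output order with less bookkeeping.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes the lines of src to dst in reverse order.
   Returns 0 on success, -1 on error. */
int reverse_lines(int src, int dst)
{
    off_t size = lseek(src, 0, SEEK_END);
    if (size == -1 || lseek(src, 0, SEEK_SET) == -1)
        return -1;
    if (size == 0)
        return 0;

    char *data = malloc((size_t)size);
    if (data == NULL)
        return -1;

    /* Read the whole file; read() may return fewer bytes than asked for. */
    off_t have = 0;
    while (have < size) {
        ssize_t n = read(src, data + have, (size_t)(size - have));
        if (n <= 0) {
            free(data);
            return -1;
        }
        have += n;
    }

    /* Walk backwards: `end` is one past the last byte not yet written. */
    int rc = 0;
    off_t end = size;
    while (end > 0 && rc == 0) {
        off_t start = end - 1;   /* step over this line's own trailing newline */
        while (start > 0 && data[start - 1] != '\n')
            start--;
        ssize_t len = (ssize_t)(end - start);
        if (write(dst, data + start, (size_t)len) != len)
            rc = -1;
        end = start;
    }

    free(data);
    return rc;
}
```

If the last line has no trailing newline, it gets glued to the line written after it, so you may want to append one in that case.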

There may be more elaborate/clever tricks, but those are the more obvious, straightforward approaches.
