
mmap from file in loop

I'm trying to use mmap to read from a file in a loop. The file holds three kinds of records ("parts"): each record of the first kind is 3*sizeof(double) bytes, each record of the second kind is also 3*sizeof(double) bytes, and each record of the third kind is sizeof(double) bytes. The file starts with a HEADER of 32768 bytes. The file is organised like this:

HEADER||Part(1),Part(1)....Part(1)||Part(2),Part(2)....Part(2)||Part(3),Part(3)....Part(3)|

Each part occurs 100 times. I want to work on 30 records at a time (10 of each part).
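To make the layout concrete, the byte offset of any record can be computed from the sizes given above. This is only a sketch: the names `record_offset`, `TOTAL_RECORDS` and the `section` enum are made up for illustration, and it assumes the three sections are stored back to back with no padding.

```c
#include <assert.h>
#include <stddef.h>

#define HEADER_SIZE   32768  /* header length stated in the question */
#define TOTAL_RECORDS 100    /* records per section, as stated in the question */

/* Hypothetical names for the three sections of the file. */
enum section { PART1 = 0, PART2 = 1, PART3 = 2 };

/* Doubles per record in each section: 3, 3 and 1. */
const size_t doubles_per_record[3] = { 3, 3, 1 };

/* Byte offset, from the start of the file, of record `index` (0-based)
 * in section `s`. Assumes the sections follow the header without padding. */
size_t record_offset(enum section s, size_t index)
{
    assert(index < TOTAL_RECORDS);

    size_t off = HEADER_SIZE;
    for (int i = 0; i < (int)s; i++)   /* skip every complete section before `s` */
        off += TOTAL_RECORDS * doubles_per_record[i] * sizeof(double);

    return off + index * doubles_per_record[s] * sizeof(double);
}
```

With an 8-byte double, this layout implies a file of 32768 + 2400 + 2400 + 800 = 38368 bytes, so comparing that figure with the real file size is a cheap sanity check before mapping anything.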

I have tried this code:

void readingFile(FILE *file, double *a, double *b, double *c, int start, int end, int chunksz, long total)
{
    int i = 0;
    int size = end - start + 1;
    int fd;
    fd = fileno(file);
    off_t fullsize = lseek(fd,SEEK_CUR,SEEK_END); //getting the file size
    fullsize-=1;//the lseek gives one more byte, its ok!
    unsigned long summ = (unsigned long)(start-1)*chunksz; //chunk is 56
    summ+=(unsigned long)HEADER_SIZE;//offset the header size
    unsigned long paramm=(unsigned long)((unsigned long)summ/(unsigned long)(sysconf(_SC_PAGE_SIZE)));
    unsigned long param = floor(paramm);
    void *buf=NULL;
    buf =mmap(NULL,fullsize , PROT_READ, MAP_PRIVATE , fd, param*sysconf(_SC_PAGE_SIZE));
    if(buf==MAP_FAILED)
    {
        printf("we have an error\n");
    }
    unsigned long gapp = (sysconf(_SC_PAGE_SIZE))*param;
    unsigned long gap =summ-gapp;
    buf+=gap;
    memcpy(a,buf,3*sizeof(double)*size);
    buf+=(unsigned long)((long)total-(start-1))*3*sizeof(double);
    buf+=((start-1)*3*sizeof(double));
    memcpy(b,buf,3*sizeof(double)*size);
    buf+=(unsigned long)((long)total-(start-1))*3*sizeof(double);
    buf+=((start-1)*sizeof(double));
    memcpy(c,buf,sizeof(double)*size);
    munmap(buf, fullsize);
    return;
}

Somewhere along the way I get an overflow and the program crashes! Each time the function is called, new memory is properly allocated for a, b and c. What is wrong here? The process crashed at iteration number 14, on this line:

memcpy(c,buf,sizeof(double)*size);

Thanks!

I know that answering a question with source code alone is unusual, but I will try to explain why mmap is so useful. Basically, mmap uses the kernel to map a file's contents into a memory region (and, for writable shared mappings, to write changes back), so we don't need to call read/seek repeatedly, which can make an application more efficient. It is also a comfortable way to access your data directly. Just see the code:

#include <unistd.h>
#include <sys/mman.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

struct mapping
{
        void* start_addr;
        size_t length;
        int fd;
};

struct mapping* map_file(const char* file)
{
        struct mapping* ret = malloc(sizeof(struct mapping));
        if(NULL == ret)
        {
                printf("Can't allocate memory for struct mapping.\n");
                return NULL;
        }

        ret->fd = open(file, O_RDONLY);
        if(0 > ret->fd)
        {
                perror("can't open specified file.");
                free(ret);
                return NULL;
        }

        struct stat fs;
        if(0 != fstat(ret->fd, &fs))
        {
                perror("can't specify file size.");
                close(ret->fd);
                free(ret);
                return NULL;
        }

        ret->length = fs.st_size;

        //offset means offset in file
        ret->start_addr = mmap(NULL, ret->length, PROT_READ, MAP_PRIVATE, ret->fd, 0);
        if(MAP_FAILED == ret->start_addr)
        {
                perror("Mapping file failed.");
                close(ret->fd);
                free(ret);
                return NULL;
        }

        return ret;
}

//returns zero on success and frees the `struct mapping` data
int unmap_file(struct mapping* mmf)
{
        //note that this mapping is read-only
        //for a writable MAP_SHARED mapping you may want to call
        //msync(mmf->start_addr, mmf->length, MS_SYNC);
        //before unmapping, to be sure all dirty pages
        //have been written back to the file.

        if(NULL != mmf->start_addr)
        {
                if(0 != munmap(mmf->start_addr, mmf->length))
                {
                        perror("Can't munmap file.");
                        return 1;
                }
        }

        mmf->start_addr = NULL;
        if(-1 != mmf->fd)
        {
                if(0 != close(mmf->fd))
                {
                        perror("can't close file descriptor.");
                        return 2;
                }
        }

        free(mmf);

        return 0;
}

// for testing: #define MAGIC_START_INDEX 0
#define  MAGIC_START_INDEX 32768

int main(int arg_length, char** args)
{
        if(arg_length < 2)
        {
                printf("No input file specified.\n");
                exit(1);
        }

        int i = 0;
        //first argument is the name of program
        while(++i < arg_length)
        {
                struct mapping* mmf = map_file(args[i]);
                if(NULL == mmf)
                {
                        printf("can't use %s for input file\n", args[i]);
                        continue;
                }

                if(mmf->length >  MAGIC_START_INDEX)
                {
                        //number of doubles stored after the header
                        size_t max_index = (mmf->length - MAGIC_START_INDEX) / sizeof(double);

                        //an offset alias for start memory address
                        //(cast to char* first: arithmetic on void* is a GNU extension)
                        double* data = (double*)((char*)mmf->start_addr + MAGIC_START_INDEX);

                        size_t ni = 0;
                        while(ni+2 < max_index)
                        {
                                printf("num0: %f, num1: %f, num2: %f\n", data[ni], data[ni+1], data[ni+2]);
                                ni += 3;
                        }
                }
                else
                {
                        printf("File: %s has no valuable data.\n", args[i]);
                }

                unmap_file(mmf);
        }
}

In main, as you can see, we can use memory addresses directly instead of repeated read operations. This is sample code that I copied (the mapping-related data is stored in a structure, and the related functions take care of creating and releasing the file mapping). The reading can be even lazier: just open the file, get its size (fstat), and, if it has valuable data, use mmap's offset parameter to skip the header section of the file:

double* data = (double*)mmap(NULL, file_length - MAGIC_START_INDEX, PROT_READ, MAP_PRIVATE, fd, MAGIC_START_INDEX); //TODO check for MAP_FAILED; the offset must be a multiple of the page size.

and you get "instant" access to the data.
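When the data you want does not start on a page boundary, as with the chunks in the question, two details matter: the offset passed to mmap must be a multiple of the page size, and munmap must receive the address that mmap returned, not a pointer that has since been advanced. A minimal sketch of that pattern follows. The helper name `copy_from_file` is hypothetical, and it assumes the caller has already checked that `offset + len` lies within the file.

```c
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy `len` bytes starting at byte `offset` of the open file `fd` into `dst`,
 * using a temporary read-only mapping. Returns 0 on success, -1 on failure.
 * Assumption: the caller guarantees that offset + len does not exceed the file size.
 */
int copy_from_file(int fd, off_t offset, void *dst, size_t len)
{
    long page = sysconf(_SC_PAGE_SIZE);
    if (page <= 0 || offset < 0 || len == 0)
        return -1;

    off_t aligned = offset - (offset % page);  /* round down to a page boundary */
    size_t gap = (size_t)(offset - aligned);   /* bytes between the boundary and the data */
    size_t map_len = gap + len;                /* map only what is needed */

    void *base = mmap(NULL, map_len, PROT_READ, MAP_PRIVATE, fd, aligned);
    if (base == MAP_FAILED)
        return -1;

    /* Keep `base` untouched; do the pointer arithmetic on a char* copy. */
    memcpy(dst, (const char *)base + gap, len);

    /* Unmap with the original address and the original length. */
    return munmap(base, map_len);
}
```

Called once per section and per chunk, this maps a few pages at a time rather than the whole file from an arbitrary offset. Mapping the whole file once, as in the program above, and indexing into it is usually simpler still when the file is small.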


 