Reading data from memory mapped file in parallel?

If the mapped file's data is fully resident in physical memory, is there any benefit to reading it in parallel, for example by defining a number of sections with start/end byte offsets and having a separate thread work on each section? The goal is to allow frequent, quick reads of data from a big binary file.

I've been running some tests (Java NIO) in which each thread (testing with 4 threads) holds a reference to the same mmap. Since each thread changes the internal pointer of the mmapped file to read the next set of bytes, this doesn't seem safe. Should I split the file into 4 mmapped chunks, one per thread?

UPDATE: To give more context, what I'm ultimately after is a data structure that holds references to a number of mmapped files, so that those references can be passed to a function that scans them in a loop, tests for values, and puts the matches into a byte buffer.

UPDATE: This is for read-only files.

You can create a different FileChannel for each thread. Each thread will read a different part.

As the documentation says, FileChannels are thread-safe.

Your code would be something like this:

package nio;

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class HughTest {

    public static void main(String[] args) {

        // try-with-resources closes both the file and its channel
        try (RandomAccessFile file = new RandomAccessFile("file_Path", "r");
             FileChannel inChannel = file.getChannel()) {

            // TODO Change in each thread the chunk size to read
            long fileSize = inChannel.size();
            // The cast assumes the file is smaller than 2 GB
            ByteBuffer buffer = ByteBuffer.allocate((int) fileSize);
            // A single read() may return fewer bytes than requested
            while (buffer.hasRemaining() && inChannel.read(buffer) != -1) {
                // keep reading until the buffer is full or EOF is reached
            }
            buffer.flip();
            // Do what you want

        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This code reads the file in a single thread. To read the entire file in parallel, you would have to adapt the code into a Runnable class and pass the size of the chunk to read in the constructor or elsewhere, as described in this question: Can I seek a file from different threads independently using FileChannel?
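
As a rough sketch of that adaptation, the example below shares one FileChannel between several tasks and uses the positional `read(ByteBuffer, long)` overload, which does not move the channel's own position. The class name `ParallelChannelRead`, the helper names, the 64 KB buffer size and the byte-sum "work" are all assumptions made for illustration; they are not part of the original answer.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelChannelRead {

    // Reads bytes [start, end) with positional reads and returns the sum of
    // their unsigned values (a placeholder for whatever real work is needed).
    static long sumRange(FileChannel channel, long start, long end) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(64 * 1024); // per-task buffer (size is arbitrary)
        long sum = 0;
        long pos = start;
        while (pos < end) {
            buffer.clear();
            buffer.limit((int) Math.min(buffer.capacity(), end - pos));
            int n = channel.read(buffer, pos); // positional read: channel position untouched
            if (n < 0) break;                  // end of file reached early
            buffer.flip();
            while (buffer.hasRemaining()) sum += buffer.get() & 0xFF;
            pos += n;
        }
        return sum;
    }

    // Splits the file into `threads` contiguous ranges and sums them in parallel.
    static long parallelSum(Path file, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            long chunk = (size + threads - 1) / threads; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                long start = Math.min(size, t * chunk);
                long end = Math.min(size, start + chunk);
                parts.add(pool.submit(() -> sumRange(channel, start, end)));
            }
            long total = 0;
            for (Future<Long> part : parts) total += part.get(); // wait before closing the channel
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `ParallelChannelRead.parallelSum(path, 4)` would process the file in four ranges. Whether this is faster than a single thread depends on the workload and is worth measuring.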

UPDATE

Unfortunately, MappedByteBuffer is not thread-safe, because it is a subclass of Buffer, as you can see here: Does memory mapped file support concurrent get/put? So you have to use a synchronization mechanism in order to read from it in parallel.
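
For a read-only mapping, one possible alternative to locking is to avoid sharing the buffer's position at all: `duplicate()` and `slice()` return views that share the mapped memory but keep their own position and limit. The sketch below assumes the file is smaller than 2 GB and that the views are created before the worker threads start; `MappedViews`, `mapReadOnly` and `view` are hypothetical names.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedViews {

    // Maps the whole file once. Assumes the file is smaller than 2 GB,
    // the upper bound for a single mapping.
    static MappedByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    // Returns a view of bytes [start, end) with its own position and limit.
    // Intended to be called from one thread, before the views are handed out.
    static ByteBuffer view(MappedByteBuffer shared, int start, int end) {
        ByteBuffer copy = shared.duplicate(); // same memory, independent position/limit
        copy.limit(end);
        copy.position(start);
        return copy.slice();                  // index 0 of the slice is byte `start`
    }
}
```

Each thread would then receive its own `ByteBuffer` from `view(...)` and could call relative `get()` on it without touching the state of the shared buffer or of the other views.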

One approach would be to copy the entire file to a temporary one (this way you ensure that the file will never be modified), and then use a Runnable implementation like this:

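   // Requires: java.io.IOException, java.io.RandomAccessFile,
   // java.nio.MappedByteBuffer, java.nio.channels.FileChannel
   private class ThreadFileRead implements Runnable {

        private final long ini;
        private final long end;

        public ThreadFileRead(long ini, long end) {
            this.ini = ini;
            this.end = end;
        }

        @Override
        public void run() {
            // try-with-resources closes the file; the mapping remains valid afterwards
            try (RandomAccessFile file = new RandomAccessFile("FILEPATH", "r")) {
                // map() takes (mode, position, size): the third argument is a
                // length, not an end offset, and must not exceed Integer.MAX_VALUE
                MappedByteBuffer out = file.getChannel()
                        .map(FileChannel.MapMode.READ_ONLY, ini, end - ini);

                // Buffer indices are relative to the start of the mapping
                for (int i = 0; i < out.limit(); i++)
                {
                    // do work, e.g. out.get(i)
                }

            } catch (IOException e) {
                e.printStackTrace();
            }

        }

    }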
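
The `ini`/`end` pairs passed to each Runnable have to come from somewhere. A minimal sketch of one way to compute them is shown below; `ChunkRanges` and `split` are hypothetical names, and the even split by ceiling division is an assumption rather than something the answer prescribes.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkRanges {

    // Splits [0, fileSize) into at most `parts` contiguous {start, end} ranges.
    // `end` is exclusive; the last range may be shorter than the others.
    static List<long[]> split(long fileSize, int parts) {
        List<long[]> ranges = new ArrayList<>();
        long chunk = (fileSize + parts - 1) / parts; // ceiling division
        for (long start = 0; start < fileSize; start += chunk) {
            ranges.add(new long[] {start, Math.min(fileSize, start + chunk)});
        }
        return ranges;
    }
}
```

For a 10-byte file and 4 parts this yields the ranges [0, 3), [3, 6), [6, 9) and [9, 10). Each pair could then be handed to its own thread, for instance `new Thread(new ThreadFileRead(r[0], r[1])).start()`. If every range is mapped separately, each one has to stay below the 2 GB limit of a single mapping.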
