
Python mmap ctypes - read only

I think I have the opposite of the problem described here. I have one process writing data to a log, and I want a second process to read it, but I don't want the second process to be able to modify the contents. The file is potentially large and I need random access, so I'm using Python's mmap module.

If I create the mmap as read/write (for the second process), I have no problem creating a ctypes object as a "view" of the mmap object using from_buffer. From a cursory look at the C code, this looks like a cast, not a copy, which is what I want. However, it breaks if I create the mmap with ACCESS_READ: from_buffer throws an exception saying it requires write privileges.

I think I want to use the ctypes from_address() method instead, which doesn't appear to need write access. I'm probably missing something simple, but I'm not sure how to get the address of a location within an mmap. I know I can use ACCESS_COPY (so write operations show up in memory but aren't persisted to disk), but I'd rather keep things read-only.

Any suggestions?
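One way to see the behaviour described in the question, and a possible workaround if a small copy is acceptable, is sketched below. It assumes a file-backed map opened with ACCESS_READ; the helper name read_int_readonly is made up for illustration. from_buffer_copy is a real ctypes classmethod that accepts read-only buffers, but it copies the bytes rather than casting in place, so it is not the zero-copy view the question asks for.

```python
import ctypes
import mmap
import os
import struct
import tempfile


def read_int_readonly(path, offset=0):
    """Return (from_buffer_failed, value) for a c_int at `offset` in a read-only map.

    Hypothetical helper, for illustration only.
    """
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        try:
            ctypes.c_int.from_buffer(mm, offset)  # requires a writable buffer
            from_buffer_failed = False
        except TypeError:
            from_buffer_failed = True
        # from_buffer_copy works on read-only buffers, at the cost of
        # copying sizeof(c_int) bytes out of the mapping.
        value = ctypes.c_int.from_buffer_copy(mm, offset).value
    return from_buffer_failed, value


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(struct.pack("ii", 7, 42))
    print(read_int_readonly(path, offset=4))
    os.remove(path)
```

For small fixed-size headers the copy is only a few bytes per record, so this may be good enough when the goal is simply to guarantee the reader cannot modify the file.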

Ran into this same problem: we needed the from_buffer interface and wanted read-only access. From the Python docs (https://docs.python.org/3/library/mmap.html): "Assignment to an ACCESS_COPY memory map affects memory but does not update the underlying file." If it's acceptable for you to use an anonymous file backing, you can use ACCESS_COPY.

An example: open two cmd.exe windows or terminals, and in one of them run:

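import ctypes
import mmap

# tagname is Windows-only; both maps refer to the same named anonymous memory
mm_file_write = mmap.mmap(-1, 4096, access=mmap.ACCESS_WRITE, tagname="shmem")
mm_file_read = mmap.mmap(-1, 4096, access=mmap.ACCESS_COPY, tagname="shmem")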

write = ctypes.c_int.from_buffer(mm_file_write)
read = ctypes.c_int.from_buffer(mm_file_read)
try:
    while True:
        value = int(input('enter an integer using mm_file_write: '))
        write.value = value
        print('updated value')
        value = int(input('enter an integer using mm_file_read: '))
        # assigning to read.value does not update the anonymous backing file
        read.value = value
        print('updated value')
except KeyboardInterrupt:
    print('got exit event')

In the other terminal, run:

import mmap
import struct
import time

mm_file = mmap.mmap(-1, 4096, access=mmap.ACCESS_WRITE, tagname="shmem")
i = None
try:
    while True:
        new_i = struct.unpack('i', mm_file[:4])[0]  # unpack returns a 1-tuple
        if i != new_i:
            print('i: {} => {}'.format(i, new_i))
            i = new_i
        time.sleep(0.1)
except KeyboardInterrupt:
    print('Stopped . . .')

You will see that the second process does not receive updates when the first process writes through the ACCESS_COPY mapping.
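The example above relies on tagname, which only exists on Windows. A single-process sketch of the same idea that should also run elsewhere is shown here, assuming a file-backed map instead of anonymous memory; the helper name scribble_on_private_copy is made up for illustration. It writes through a ctypes view of an ACCESS_COPY map and then re-reads the file to check that the value on disk is untouched.

```python
import ctypes
import mmap
import os
import struct
import tempfile


def scribble_on_private_copy(path, new_value):
    """Write through an ACCESS_COPY map; return (value seen in map, value on disk).

    Hypothetical helper, for illustration only.
    """
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
        view = ctypes.c_int.from_buffer(mm)  # allowed: a copy-on-write map is writable
        view.value = new_value               # changes only this process's private pages
        in_memory = view.value
        del view                             # release the exported buffer before closing
        mm.close()
    with open(path, "rb") as f:
        (on_disk,) = struct.unpack("i", f.read(4))
    return in_memory, on_disk


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(struct.pack("i", 1))
    print(scribble_on_private_copy(path, 99))
    os.remove(path)
```

Note that this only protects the file from the reader's writes. The reader's code can still assign to the view, so it is not a true read-only view.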

I ran into a similar issue (unable to set up a read-only mmap), but I was using only the Python mmap module: Python mmap 'Permission denied' on Linux

I'm not sure it is of any help to you, since you don't want the mmap to be private?

OK, from looking at the mmap.c code, I don't believe it supports this use case. Also, I found that the performance pretty much sucks for my use case. I'd be curious what kind of performance others see, but I found that it took about 40 seconds to walk through a 500 MB binary file in Python. That is creating an mmap, then turning each location into a ctypes object with from_buffer(), and using the ctypes object to decipher the size of the record so I could step to the next one. I tried doing the same thing directly in C++ with MSVC. Obviously there I could cast directly into an object of the correct type, and it was fast - less than a second (this is with a Core 2 Quad and an SSD).
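For comparison, here is a minimal sketch of the same kind of walk done with the struct module instead of one ctypes object per record. The record layout is an assumption made up for the example (a 4-byte little-endian payload length followed by the payload), not the layout of the file discussed above, and no timing is claimed for it. struct.Struct.unpack_from reads straight out of the buffer, so it also works on an ACCESS_READ map.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record header: just the payload length, little-endian uint32.
HEADER = struct.Struct("<I")


def record_offsets(path):
    """Return the starting offset of every record in a length-prefixed file."""
    offsets = []
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        pos, end = 0, len(mm)
        while pos + HEADER.size <= end:
            # Decode the header in place; no write access is needed.
            (length,) = HEADER.unpack_from(mm, pos)
            offsets.append(pos)
            pos += HEADER.size + length  # step to the next record
    return offsets


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for payload in (b"ab", b"", b"xyz"):
            f.write(HEADER.pack(len(payload)) + payload)
    print(record_offsets(path))
    os.remove(path)
```

Precompiling the format with struct.Struct avoids re-parsing the format string on every record, which may matter when a file holds millions of small records.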

I did find that I could get a pointer with the following:

firstHeader = CEL_HEADER.from_buffer(map, 0) #CEL_HEADER is a ctypes Structure
pHeader = pointer(firstHeader)
#Now I can use pHeader[ind] to get a CEL_HEADER object 
#at an arbitrary point in the file

This doesn't get around the original problem - the mmap isn't read-only, since I still need to use from_buffer for the first call. In this configuration it still took around 40 seconds to process the whole file, so it looks like the conversion from a pointer into ctypes structs is killing the performance. That's just a guess, but I don't see a lot of value in tracking it down further.

I'm not sure my plan will help anyone else, but I'm going to try to create a C module specific to my needs, based on the mmap code. I think I can use fast C code to index the binary file, then expose only small parts of the file at a time through calls into ctypes/Python objects. Wish me luck.

Also, as a side note, Python 2.7.2 was released today (6/12/11), and one of the changes is an update to the mmap code so that you can use a Python long to set the file offset. This lets you use mmap for files over 4 GB on 32-bit systems. See Issue #4681.
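As a rough illustration of the offset parameter, the sketch below maps only a window of a file rather than the whole thing. The helper name read_window is made up for the example. The one hard constraint it works around is that the offset passed to mmap.mmap must be a multiple of mmap.ALLOCATIONGRANULARITY, so the requested position is rounded down and the difference is skipped inside the mapping.

```python
import mmap
import os
import tempfile


def read_window(path, offset, length):
    """Return `length` bytes starting at `offset`, mapping only the region needed.

    Hypothetical helper, for illustration only. The caller must ensure
    offset + length does not run past the end of the file.
    """
    gran = mmap.ALLOCATIONGRANULARITY
    base = (offset // gran) * gran  # mmap offsets must be a multiple of gran
    delta = offset - base           # distance from the aligned base to the data
    with open(path, "rb") as f, mmap.mmap(
        f.fileno(), delta + length, access=mmap.ACCESS_READ, offset=base
    ) as mm:
        return mm[delta:delta + length]


if __name__ == "__main__":
    gran = mmap.ALLOCATIONGRANULARITY
    data = bytes(range(256)) * (3 * gran // 256)
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    print(read_window(path, gran + 5, 4) == data[gran + 5:gran + 9])
    os.remove(path)
```

Mapping windows like this keeps the address-space footprint small, which is the part that matters on 32-bit systems.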
