Parsing Mbox from an open file-like object in Python?
This works:
import mailbox
x = mailbox.mbox('filename.mbox') # works
But what if I only have an open file handle instead of a filename?
fp = open('filename.mbox', mode='rb') # for example; there are many ways to get a file-like object
x = mailbox.mbox(fp) # doesn't work
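Question: what is the best (cleanest, fastest) way to open an Mbox from a byte stream, i.e. an open binary handle, without first copying the bytes into a named file?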
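mailbox.mbox() has to call the built-in function open() at some point. A hacky solution is therefore to intercept that call and return the pre-existing file-like object. A draft solution follows: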
import builtins
import mailbox

# FLO stands for file-like object
class MboxFromFLO:

    def __init__(self, flo):
        original_open = builtins.open
        fake_path = '/tmp/MboxFromFLO'
        self.fake_path = fake_path

        def open_proxy(*args, **kwargs):
            print('open_proxy{} was called:'.format(args))
            if args[0] == fake_path:
                print('Call to open() was intercepted')
                return flo
            else:
                print('Call to open() was let through')
                return original_open(*args, **kwargs)

        self.original_open = original_open
        builtins.open = open_proxy
        print('Instrumenting open()')

    def __enter__(self):
        return mailbox.mbox(self.fake_path)

    def __exit__(self, exc_type, exc_value, traceback):
        print('Restoring open()')
        builtins.open = self.original_open


# Demonstration

# Create an mbox file so that we can use it later
b = mailbox.mbox('test.mbox')
key = b.add('This is a MboxFromFLO test message')
b.close()  # flush the buffered write so the message is actually on disk

f = open('test.mbox', 'rb')
with MboxFromFLO(f) as b:
    print('Msg#{}:'.format(key), b.get(key))
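Some caveats regarding possible future changes in the implementation of mailbox.mbox:

mailbox.mbox may open additional files besides the one passed to its constructor. Even if it does not, the monkey-patched open() will be used by any other Python code that runs while the patch is in effect (i.e. as long as the context managed by MboxFromFLO is active). You must make sure that the fake path you generate (which is what lets you recognize the right open() call later, if there are several such calls) does not conflict with any of those files.

mailbox.mbox may decide to check the specified path in some way before opening it (e.g. using os.path.exists(), os.path.isfile(), etc.) and will fail if that path does not exist.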
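One way to narrow the first caveat is to leave builtins untouched. The sketch below is a variant under stated assumptions, not part of the original answer: it uses unittest.mock.patch.object to place an open attribute on the mailbox module only, and only while the mbox object is being constructed, and it derives a random fake path with uuid. The helper name mbox_from_flo is hypothetical. The approach still relies on mailbox.mbox looking up open() as a module-level global inside its constructor, which is an implementation detail of the standard library, and it is only meant for read access.

```python
import builtins
import contextlib
import io
import mailbox
import os
import tempfile
import uuid
from unittest import mock


@contextlib.contextmanager
def mbox_from_flo(flo):
    """Hypothetical helper: yield a mailbox.mbox that reads from `flo`."""
    # A random name makes a clash with a real file very unlikely.
    fake_path = os.path.abspath(
        os.path.join(tempfile.gettempdir(), "MboxFromFLO-" + uuid.uuid4().hex)
    )
    real_open = builtins.open

    def open_proxy(file, *args, **kwargs):
        # Hand back the existing object only for our fake path.
        if file == fake_path:
            return flo
        return real_open(file, *args, **kwargs)

    # Shadow open() inside the mailbox module only, and only during
    # construction; other modules keep using the real built-in.
    with mock.patch.object(mailbox, "open", open_proxy, create=True):
        box = mailbox.mbox(fake_path, create=False)
    yield box


# Usage with an in-memory mbox (assumes POSIX "\n" line endings).
raw = b"From MAILER-DAEMON Thu Jan  1 00:00:00 1970\nSubject: hi\n\nbody\n"
with mbox_from_flo(io.BytesIO(raw)) as box:
    subjects = [msg["Subject"] for msg in box]
print(subjects)
```

Because the patch is removed as soon as the constructor returns, later calls that need a real path (for example a flush() after modifying the mailbox) would not be redirected, so this is best treated as a read-only view.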
You could subclass mailbox.mbox. The source code of the standard library can be found on GitHub.
The logic seems to live mostly in the superclass _singlefileMailbox.
class _singlefileMailbox(Mailbox):
    """A single-file mailbox."""

    def __init__(self, path, factory=None, create=True):
        """Initialize a single-file mailbox."""
        Mailbox.__init__(self, path, factory, create)
        try:
            f = open(self._path, 'rb+')
        except OSError as e:
            if e.errno == errno.ENOENT:
                if create:
                    f = open(self._path, 'wb+')
                else:
                    raise NoSuchMailboxError(self._path)
            elif e.errno in (errno.EACCES, errno.EROFS):
                f = open(self._path, 'rb')
            else:
                raise
        self._file = f
        self._toc = None
        self._next_key = 0
        self._pending = False       # No changes require rewriting the file.
        self._pending_sync = False  # No need to sync the file
        self._locked = False
        self._file_length = None    # Used to record mailbox size
So we can try to get rid of the open() logic and replace the init code from mbox and its other superclasses with our own.
import mailbox


class CustomMbox(mailbox.mbox):
    """A custom mbox mailbox from a file like object."""

    def __init__(self, fp, factory=None, create=True):
        """Initialize mbox mailbox from a file-like object."""
        # The parent __init__ methods are intentionally not called,
        # because they would try to open() a path.

        # from `mailbox.mbox`
        self._message_factory = mailbox.mboxMessage

        # from `mailbox._singlefileMailbox`
        self._file = fp
        self._toc = None
        self._next_key = 0
        self._pending = False       # No changes require rewriting the file.
        self._pending_sync = False  # No need to sync the file
        self._locked = False
        self._file_length = None    # Used to record mailbox size

        # from `mailbox.Mailbox`
        self._factory = factory

    @property
    def _path(self):
        # If we try to use some functionality that relies on knowing
        # the original path, raise an error.
        raise NotImplementedError('This class does not have a file path')

    def flush(self):
        """Write any pending changes to disk."""
        # _singlefileMailbox has quite complicated flush method.
        # Hopefully this will work fine.
        self._file.flush()
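This could be a start. But you would probably have to define other methods as well to get the full functionality of the other mailbox classes.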
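As a rough illustration of how such a subclass might be exercised, the sketch below restates a condensed version of the class so that it runs on its own, then reads two messages from an in-memory io.BytesIO. The class name FLOMbox and the sample messages are hypothetical. The code relies on private attributes of the mailbox module that may change between Python versions, assumes POSIX "\n" line endings, and only exercises read access.

```python
import io
import mailbox


class FLOMbox(mailbox.mbox):
    """Hypothetical mbox backed by a seekable binary file-like object."""

    def __init__(self, fp, factory=None):
        # Deliberately skips the parent __init__ methods, which would call
        # open(). The private attributes below mirror what they set up.
        self._message_factory = mailbox.mboxMessage
        self._factory = factory
        self._file = fp
        self._toc = None
        self._next_key = 0
        self._pending = False
        self._pending_sync = False
        self._locked = False
        self._file_length = None

    def flush(self):
        # No path-based rewrite is possible here; just flush the object.
        self._file.flush()


raw = (
    b"From alice@example.com Thu Jan  1 00:00:00 1970\n"
    b"Subject: first\n"
    b"\n"
    b"Hello from message one.\n"
    b"\n"
    b"From bob@example.com Thu Jan  1 00:00:01 1970\n"
    b"Subject: second\n"
    b"\n"
    b"Hello from message two.\n"
)

box = FLOMbox(io.BytesIO(raw))
subjects = [msg["Subject"] for msg in box]  # iterating builds the table of contents
print(len(box), subjects)
```

Methods that modify the mailbox or need a real file name (for instance lock(), or a full flush() after add() or remove()) are not covered by this sketch and would likely need their own overrides.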