joining big files in Python
I have several HEVC files that I want to concatenate. For small files (around 1.5 GB) the following code works fine:
with open(path+"/"+str(sys.argv[2])+"_EL.265", "wb") as outfile:
    for fname in dirs:
        with open(path+"/"+fname, 'rb') as infile:
            outfile.write(infile.read())
For larger files (8 GB or more) the same code gets stuck. I copied the code for reading a big file lazily from here (Lazy Method for Reading Big File in Python?) and integrated it with my code:
def read_in_chunks(file_object, chunk_size=1024):
    """Lazy function (generator) to read a file piece by piece.
    Default chunk size: 1k."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data
with open(path + "/" + str(sys.argv[2]) + "_BL.265", "wb") as outfile_bl:
    for fname in dirs:
        with open(path+"/"+fname, 'rb') as infile:
            for piece in read_in_chunks(infile):
                outfile_bl.write(infile.read())
This code produces a file of roughly the right size, but it is no longer a valid HEVC file and video players cannot play it.
Any ideas? Please help.
Dario
You are reading from infile in two different places: inside read_in_chunks, and again directly in the call to outfile_bl.write. As a result, the data just read into the variable piece is never written to the output, so part of each input file goes missing and the stream is corrupted.
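A tiny, self-contained sketch (hypothetical in-memory data, not from the question) makes the effect visible: the chunk pulled by the generator is discarded and only the remaining bytes get copied.

import io

def read_in_chunks(file_object, chunk_size=2):
    """Same lazy generator as above, with a tiny chunk size for the demo."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

src = io.BytesIO(b"abcdefgh")   # stands in for one input file
copied = b""
for piece in read_in_chunks(src):
    copied += src.read()        # same mistake: reads again instead of using piece

print(copied)  # b'cdefgh' -- the first chunk b'ab' was read into piece and lost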
You have already read the data into piece; just write that to your file:
with open(path + "/" + str(sys.argv[2]) + "_BL.265", "wb") as outfile_bl:
    for fname in dirs:
        with open(path+"/"+fname, 'rb') as infile:
            for piece in read_in_chunks(infile):
                outfile_bl.write(piece)
As an aside, you don't really need to define read_in_chunks yourself, or at least its definition can be simplified considerably by using iter:
def read_in_chunks(file_object, chunk_size=1024):
    """Lazy function (generator) to read a file piece by piece.
    Default chunk size: 1k."""
    # The sentinel is b'' because the files are opened in binary mode,
    # so read() returns bytes and yields b'' at end of file.
    yield from iter(lambda: file_object.read(chunk_size), b'')
    # Or
    # from functools import partial
    # yield from iter(partial(file_object.read, chunk_size), b'')
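For this particular task you could also skip the helper generator entirely: the standard library's shutil.copyfileobj copies between file objects in fixed-size chunks. A minimal sketch under the same assumptions as the question (path, dirs and sys.argv[2] defined as before):

import shutil
import sys

with open(path + "/" + str(sys.argv[2]) + "_BL.265", "wb") as outfile_bl:
    for fname in dirs:
        with open(path + "/" + fname, 'rb') as infile:
            # copyfileobj reads and writes in fixed-size chunks internally,
            # so memory use stays flat no matter how large the input files are.
            shutil.copyfileobj(infile, outfile_bl)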