Joining large files in Python

Question

I have several HEVC files that I want to merge. For small files (around 1.5 GB), the following code works fine:

with open(path+"/"+str(sys.argv[2])+"_EL.265", "wb") as outfile:
    for fname in dirs:
        with open(path+"/"+fname, 'rb') as infile:
            outfile.write(infile.read())

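For larger files (8 GB or more), the same code gets stuck. I copied the code for reading large files from here (Lazy Method for Reading Big File in Python?) and integrated it with my code: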

def read_in_chunks(file_object, chunk_size=1024):
    """Lazy function (generator) to read a file piece by piece.
    Default chunk size: 1k."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data


with open(path + "/" + str(sys.argv[2]) + "_BL.265", "wb") as outfile_bl:
    for fname in dirs:
        with open(path+"/"+fname, 'rb') as infile:
            for piece in read_in_chunks(infile):
                outfile_bl.write(infile.read())

This code produces a file of about the right size, but it is no longer a valid HEVC file and video players cannot read it.

Any ideas? Please help.

Dario

Answer 1

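You are reading from infile in two different places: inside read_in_chunks, and directly in your call to outfile_bl.write. As a result, the data that was just read into the variable piece is never written. With the code as posted, the first chunk of each input file is dropped and the remainder is then pulled in by a single infile.read() call, so the output is close to the expected size but is corrupted.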

You have already read the data into piece; just write that to your file.

with open(path + "/" + str(sys.argv[2]) + "_BL.265", "wb") as outfile_bl:
    for fname in dirs:
        with open(path+"/"+fname, 'rb') as infile:
            for piece in read_in_chunks(infile):
                outfile_bl.write(piece)
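
To see the difference between the two loops without touching multi-gigabyte files, here is a small in-memory check. It is only a sketch: io.BytesIO stands in for the real files, and the 5120-byte source is an arbitrary test payload, not anything from the question.

```python
import io


def read_in_chunks(file_object, chunk_size=1024):
    """Yield successive chunks from a binary file object."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data


# Hypothetical stand-in for one input file (5120 bytes).
source = bytes(range(256)) * 20

# Loop from the question: reads from infile again inside the loop body.
infile = io.BytesIO(source)
buggy = io.BytesIO()
for piece in read_in_chunks(infile):
    buggy.write(infile.read())

# Corrected loop: writes the chunk that was already read.
infile = io.BytesIO(source)
fixed = io.BytesIO()
for piece in read_in_chunks(infile):
    fixed.write(piece)

buggy_bytes = buggy.getvalue()
fixed_bytes = fixed.getvalue()

print(len(source), len(buggy_bytes), len(fixed_bytes))  # 5120 4096 5120
print(buggy_bytes == source[1024:])  # True: the first 1024-byte chunk is missing
print(fixed_bytes == source)         # True
```

Losing the leading bytes of each input is enough to break the bitstream, which would explain an output that looks about the right size but does not play.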

By the way, you don't really need to define read_in_chunks, or at least you can simplify its definition considerably by using iter:

def read_in_chunks(file_object, chunk_size=1024):
    """Lazy function (generator) to read a file piece by piece.
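    Default chunk size: 1k."""
    # The sentinel must be b'' because the file is opened in binary mode;
    # read() returns bytes there, and b'' never compares equal to ''.
    # Use '' only for a file opened in text mode.
    yield from iter(lambda: file_object.read(chunk_size), b'')

    # Or
    # from functools import partial
    # yield from iter(partial(file_object.read, chunk_size), b'')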
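
As a sketch of what "not defining read_in_chunks at all" could look like, the merge loop can call iter directly, or hand the chunked copy to shutil.copyfileobj from the standard library. The function names, the 1 MiB chunk size, and the temporary test files below are illustrative assumptions, not part of the original code.

```python
import os
import shutil
import tempfile
from functools import partial

CHUNK_SIZE = 1024 * 1024  # assumption: 1 MiB per read; tune as needed


def merge_with_iter(output_path, input_paths, chunk_size=CHUNK_SIZE):
    """Concatenate files using iter() with a b'' sentinel (binary mode)."""
    with open(output_path, "wb") as outfile:
        for input_path in input_paths:
            with open(input_path, "rb") as infile:
                # iter() keeps calling infile.read(chunk_size) until it returns b''.
                for piece in iter(partial(infile.read, chunk_size), b""):
                    outfile.write(piece)


def merge_with_copyfileobj(output_path, input_paths, chunk_size=CHUNK_SIZE):
    """Same result, delegating the chunked loop to shutil.copyfileobj."""
    with open(output_path, "wb") as outfile:
        for input_path in input_paths:
            with open(input_path, "rb") as infile:
                shutil.copyfileobj(infile, outfile, chunk_size)


# Small self-check with throwaway files (hypothetical payloads).
tmp_dir = tempfile.mkdtemp()
payloads = [b"abc" * 1000, b"", b"xyz" * 7]
parts = []
for index, payload in enumerate(payloads):
    part_path = os.path.join(tmp_dir, f"part{index}.bin")
    with open(part_path, "wb") as handle:
        handle.write(payload)
    parts.append(part_path)

expected = b"".join(payloads)

iter_out = os.path.join(tmp_dir, "merged_iter.bin")
copy_out = os.path.join(tmp_dir, "merged_copy.bin")
merge_with_iter(iter_out, parts, chunk_size=16)
merge_with_copyfileobj(copy_out, parts, chunk_size=16)

with open(iter_out, "rb") as handle:
    iter_result = handle.read()
with open(copy_out, "rb") as handle:
    copy_result = handle.read()

shutil.rmtree(tmp_dir)  # remove the throwaway files
print(iter_result == expected, copy_result == expected)
```

Either way, only one chunk is held in memory at a time. The input paths are written in the order given, so the list needs to be in the intended playback order.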

Answer 1: accepted, 2022-02-03 14:58:47
