How to avoid file content repetition with zipfile

I need to compress multiple XML files, and I achieved this with lxml, zipfile and a for loop.

My problem is that every time I rerun my function, the content of the compressed files repeats (it is appended at the end) and gets longer. I believe it has to do with the a+b write mode. I thought that by using with open, the files would be deleted at the end of the code block and no more content would be added to them. I was wrong, and with the other modes I do not get the intended result.

Here is my code:

def compress_package_file(self):
   bytes_buffer = BytesIO()
   with zipfile.ZipFile(bytes_buffer, 'w') as invoices_package:
       i = 1
       for invoice in record.invoice_ids.sorted('sin_number'):
           invoice_file_name = 'Invoice_' + invoice.number + '.xml'
           with open(invoice_file_name, 'a+b') as invoice_file:
               invoice_file.write(invoice._get_invoice_xml().getvalue())
               invoices_package.write(invoice_file_name, compress_type=zipfile.ZIP_DEFLATED)
           i += 1
   compressed_package = bytes_buffer.getvalue()
   encoded_compressed_file = base64.b64encode(compressed_package)

My XML generator is in another function and works fine. But the content repeats each time I run this function. For example, if I run it twice, the content of the files in the compressed file looks something like this (simplified content):

<?xml version='1.0' encoding='UTF-8'?>
<invoice xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="invoice.xsd">
    <header>
        <invoiceNumber>9</invoiceNumber>
    </header>
</invoice><?xml version='1.0' encoding='UTF-8'?>
<invoice xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="invoice.xsd">
    <header>
        <invoiceNumber>9</invoiceNumber>
    </header>
</invoice>

If I use w+b mode, the content of the files is blank. What should my code look like to avoid this behavior?

I suggest you do use w+b mode, but move the write to the zipfile so that it happens after the invoice XML file has been closed.

From what you wrote, it looks as if you are trying to compress a file that has not yet been flushed to disk, so with w+b it is still empty at the time of compression. With a+b the file keeps whatever earlier runs wrote to it, which would explain the repetition.
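To illustrate the buffering behaviour described above, here is a minimal sketch. The file name and payload are made up for the demonstration, and the exact size seen while the file is open depends on Python's buffering, so treat the first number as typical rather than guaranteed.

```python
import os
import tempfile

# Hypothetical payload standing in for the generated invoice XML.
payload = b"<invoice/>"

with tempfile.TemporaryDirectory() as work_dir:
    path = os.path.join(work_dir, 'Invoice_demo.xml')
    with open(path, 'w+b') as invoice_file:
        invoice_file.write(payload)
        # Small writes usually still sit in Python's buffer here,
        # so the file on disk can look empty to anyone reading it by name.
        size_while_open = os.path.getsize(path)
    # Leaving the with block closes the file, which flushes the buffer.
    size_after_close = os.path.getsize(path)

print(size_while_open, size_after_close)
```

On CPython with default buffering, the first size is normally 0 and the second is the full payload length, which matches the blank files seen with w+b.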

So, try removing one level of indentation from the invoices_package.write line (I can't format code properly on mobile, so I can't post the whole section).
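Here is a rough sketch of what the suggested change could look like. It is not the asker's actual code: the `compress_invoices` function, the `invoices` mapping (invoice number to XML bytes) and the `work_dir` argument are stand-ins for the record and `_get_invoice_xml()` calls in the question.

```python
import base64
import os
import tempfile
import zipfile
from io import BytesIO


def compress_invoices(invoices, work_dir):
    """Zip each invoice XML and return the archive base64-encoded.

    `invoices` is a hypothetical mapping of invoice number -> XML bytes.
    """
    bytes_buffer = BytesIO()
    with zipfile.ZipFile(bytes_buffer, 'w') as invoices_package:
        for number, xml_bytes in sorted(invoices.items()):
            invoice_file_name = 'Invoice_' + number + '.xml'
            invoice_path = os.path.join(work_dir, invoice_file_name)
            # 'wb' truncates the file, so content from earlier runs is discarded.
            with open(invoice_path, 'wb') as invoice_file:
                invoice_file.write(xml_bytes)
            # Dedented: the file is closed and flushed before it is zipped.
            invoices_package.write(invoice_path, arcname=invoice_file_name,
                                   compress_type=zipfile.ZIP_DEFLATED)
    return base64.b64encode(bytes_buffer.getvalue())


# Run twice against the same directory to check that nothing accumulates.
invoices = {'9': b"<?xml version='1.0' encoding='UTF-8'?>\n<invoice/>"}
with tempfile.TemporaryDirectory() as work_dir:
    first = compress_invoices(invoices, work_dir)
    second = compress_invoices(invoices, work_dir)

with zipfile.ZipFile(BytesIO(base64.b64decode(second))) as package:
    content = package.read('Invoice_9.xml')
```

If the XML files are not needed on disk at all, `ZipFile.writestr(invoice_file_name, xml_bytes)` would add the bytes to the archive directly and avoid the temporary files.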
