
Writing binary files in Python 2.6+ and 3

I am writing a program that needs to write a binary file. It contains a header of strings (key-value pairs) and numeric data (which can be little-endian or big-endian), and I am looking for a way to write a program that runs unchanged in Python 2.6+ and Python 3.2+.

Can anyone suggest some best practices? Additionally, what is the right way to deal with endianness without cluttering the logic of my program with struct.pack? Should I subclass BufferedWriter?

Thanks in advance.

Once you have created the binary data, you just write it to a file opened in binary mode. That's all there is to it. There are no compatibility problems between Python 2 and 3 there at all.
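As a minimal sketch of that point, the snippet below builds a byte string with struct and writes it to a file opened with mode "wb". The function name write_values and the file layout (a 32-bit count followed by 32-bit signed integers) are assumptions made for illustration, not something taken from the question.

```python
import os
import struct
import tempfile

def write_values(path, values, byteorder="<"):
    """Write a 32-bit count followed by 32-bit signed integers to `path`."""
    # "<" selects little-endian and ">" big-endian; both use standard
    # sizes with no padding, so the output is the same on every platform.
    payload = struct.pack(byteorder + "I", len(values))
    for v in values:
        payload += struct.pack(byteorder + "i", v)
    # "wb" gives a binary-mode file; it accepts the byte string as-is.
    with open(path, "wb") as f:
        f.write(payload)
    return payload

demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
payload = write_values(demo_path, [1, -2, 3])
```

The same source should run on Python 2.6+ and Python 3, because struct.pack returns the native byte-string type on both.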

Subclassing BufferedWriter is completely unnecessary.

How you create the data is a different question, but there again I don't see any obvious incompatibility problems.
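One possible way to create the data without scattering struct.pack calls through the program is a small wrapper that fixes the byte order once and wraps (rather than subclasses) any binary file object. The class name BinaryWriter, its method names, and the length-prefixed UTF-8 string layout are all hypothetical choices for this sketch.

```python
import io
import struct

class BinaryWriter(object):
    """Hypothetical helper: wraps a binary file object, byte order set once."""

    def __init__(self, fileobj, big_endian=False):
        self._f = fileobj
        self._prefix = ">" if big_endian else "<"

    def write_uint32(self, value):
        self._f.write(struct.pack(self._prefix + "I", value))

    def write_float64(self, value):
        self._f.write(struct.pack(self._prefix + "d", value))

    def write_string(self, text):
        # Assumed layout: 32-bit length, then UTF-8 bytes.
        # On Python 2, pass unicode objects (or ASCII-only str).
        data = text.encode("utf-8")
        self.write_uint32(len(data))
        self._f.write(data)

    def write_header(self, pairs):
        # Assumed layout: pair count, then each key and value as strings.
        self.write_uint32(len(pairs))
        for key, value in pairs:
            self.write_string(key)
            self.write_string(value)

buf = io.BytesIO()  # any file opened with "wb" would work the same way
writer = BinaryWriter(buf, big_endian=True)
writer.write_header([("name", "demo")])
writer.write_uint32(258)
```

Because the byte order lives in one attribute, switching between little-endian and big-endian output is a single constructor argument.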

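It's not clear whether you already have a predefined binary file format that you need to match. If not, and you are just trying to serialize a data structure in a way that can be read by multiple programming languages (and by multiple Python versions on multiple platforms), you may want to look at Protocol Buffers.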

You can use os.open + os.write + os.close. However, os.write may write fewer bytes than requested, so these calls need while loops for reliability.
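A rough sketch of such a loop follows. The helper name write_all is made up for this example, and it assumes a blocking file descriptor.

```python
import os
import tempfile

def write_all(path, data):
    """Write all of `data` (a byte string) to `path` with low-level os calls."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # os.O_BINARY exists only on Windows, where it disables newline translation.
    flags |= getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags, 0o644)
    try:
        offset = 0
        while offset < len(data):
            # os.write returns the number of bytes actually written,
            # which can be less than the number requested.
            offset += os.write(fd, data[offset:])
    finally:
        os.close(fd)

demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
write_all(demo_path, b"\x00\x01\x02\x03")
```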

I've done this with http://stromberg.dnsalias.org/~strombrg/bufsock.html (I wrote it; my former employer allowed me to open-source it) in a deduplicating backup program that does lots of binary I/O. Doing it this way doesn't require loops, even if you're employing signals. Don't let the "bufsock" name fool you: it's also good for file I/O.

BTW, writing data in an endian-dependent way is usually a mistake in the long term. If you're not content with the built-in Python tools (I'm not always; e.g., I once needed a 3-byte integer type), it's probably better to tear apart your numbers using divmod with 256. Another alternative is to use http://stromberg.dnsalias.org/~strombrg/base255.html to get null-terminatable strings.
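The divmod approach might look like the sketch below, which handles arbitrary widths such as a 3-byte integer. The function names int_to_bytes and bytes_to_int are invented for illustration, and only non-negative values are covered.

```python
def int_to_bytes(value, length, big_endian=True):
    """Split a non-negative integer into `length` bytes using divmod."""
    if value < 0:
        raise ValueError("value must be non-negative")
    digits = []
    for _ in range(length):
        value, low = divmod(value, 256)
        digits.append(low)  # least significant byte comes out first
    if value:
        raise OverflowError("value does not fit in %d bytes" % length)
    if big_endian:
        digits.reverse()
    # bytes(bytearray(...)) yields a byte string on Python 2.6+ and 3.
    return bytes(bytearray(digits))

def bytes_to_int(data, big_endian=True):
    """Inverse of int_to_bytes."""
    digits = bytearray(data)
    if not big_endian:
        digits.reverse()
    result = 0
    for b in digits:
        result = result * 256 + b
    return result

three_bytes = int_to_bytes(0x010203, 3)
```

The byte order is decided explicitly by the code, so the output does not depend on the machine the program runs on.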
