Fastest way to pack a list of floats into bytes in Python

Question

I have a list of, say, 100k floats and I want to convert it into a bytes buffer.

buf = bytes()
for val in floatList:
   buf += struct.pack('f', val)
return buf

This is quite slow. How can I make it faster using only standard Python 3.x libraries?

Answer 1

Just tell struct how many floats you have. Packing 100k floats takes about 1/100th of a second on my slow laptop.

import random
import struct

floatlist = [random.random() for _ in range(10**5)]
buf = struct.pack('%sf' % len(floatlist), *floatlist)
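A quick way to check the result is to unpack it again. The sketch below assumes a little-endian layout ('<') is wanted, which the question does not state; without a prefix, struct uses the machine's native byte order and alignment.

```python
import random
import struct

floatlist = [random.random() for _ in range(10**5)]

# '<' pins little-endian byte order (an assumption); drop it for native order.
fmt = '<%df' % len(floatlist)
buf = struct.pack(fmt, *floatlist)

# Each 'f' item occupies 4 bytes.
size_ok = len(buf) == 4 * len(floatlist)

# Round trip: the values come back rounded to single precision.
restored = struct.unpack(fmt, buf)
max_error = max(abs(a - b) for a, b in zip(floatlist, restored))
```

The round-trip error is nonzero because 'f' stores 32-bit floats, while Python floats are 64-bit.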

Answer 2

You can use ctypes, and have a double array (or float array) exactly as you'd have in C, instead of keeping your data in a list. This is fairly low level, but is worth considering if you need great performance and if your list is of a fixed size.

You can create the equivalent of a C double array[100]; in Python by doing:

array = (ctypes.c_double * 100)()

The ctypes.c_double * 100 expression yields a Python class for an array of doubles, 100 items long. To write it to a file in Python 3, pass the array to bytes() to get its contents (Python 2 used buffer() for this):

>>> f = open("bla.dat", "wb")
>>> f.write(bytes(array))

If your data is already in a Python list, packing it into a double array may or may not be faster than calling struct as in Agf's accepted answer. I will leave measuring which is faster as homework, but all the code you need is this:

>>> import ctypes
>>> array = (ctypes.c_double * len(floatlist))(*floatlist)

To see it as a bytes object in Python 3, just do bytes(array) (in Python 2, str(buffer(array))). The one drawback here is that you have to take care of the float size (float vs. double) and the CPU-dependent float type yourself; the struct module can take care of this for you.

The big win is that with a float array you can still use the elements as numbers, accessing them just as if it were a plain Python list, while having them readily available as a flat memory region through the buffer protocol (buffer in Python 2, memoryview or bytes in Python 3).
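As a minimal Python 3 sketch of that point, the example below uses c_float rather than c_double (an assumption, chosen so the element size matches struct's 'f') and sample values picked for illustration.

```python
import ctypes

floatlist = [1.0, 2.5, -3.25]

# c_float gives 4-byte elements, matching struct's 'f' format.
array = (ctypes.c_float * len(floatlist))(*floatlist)

# Elements still behave like ordinary numbers.
array[0] = array[1] + array[2]   # 2.5 + -3.25

raw = bytes(array)               # a copy of the underlying memory
view = memoryview(array)         # a view of the same memory, no copy
```

Here bytes(array) makes a snapshot, while the memoryview keeps following later changes to the array.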

Answer 3

A couple of answers suggest

import struct
buf = struct.pack(f'{len(floatlist)}f', *floatlist)

but the use of '*' needlessly converts floatlist to a tuple before passing it to struct.pack. It's faster to avoid that, by first creating an empty buffer, and then populating it using slice assignment:

import ctypes
buf = (ctypes.c_double * len(floatlist))()
buf[:] = floatlist

Other performance savings some people might be able to use:

You can reuse an existing buffer by just doing the assignment again, without having to create a new buffer.
You can modify parts of an existing buffer by assigning to the appropriate slice.
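The two savings above can be sketched as follows. This example assumes c_float instead of c_double so that each element is 4 bytes like struct's 'f'; the values are made up for illustration. Note that slice assignment on a ctypes array requires the new sequence to have the same length as the slice.

```python
import ctypes

floatlist = [0.5, 1.5, 2.5, 3.5]

buf = (ctypes.c_float * len(floatlist))()  # zero-initialised buffer
buf[:] = floatlist                         # fill it in one step
first = bytes(buf)

# Reuse the same buffer for new data of the same length.
buf[:] = [4.5, 5.5, 6.5, 7.5]

# Overwrite only part of it.
buf[1:3] = [0.0, 0.0]
second = bytes(buf)
```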

Answer 4

For an array of single-precision floats there are two options: use struct or array.

In[103]: import random
import struct
from array import array

floatlist = [random.random() for _ in range(10**5)]

In[104]: %timeit struct.pack('%sf' % len(floatlist), *floatlist)
100 loops, best of 3: 2.86 ms per loop

In[105]: %timeit array('f', floatlist).tobytes()  # originally timed as tostring(), which was removed in Python 3.9
100 loops, best of 3: 4.11 ms per loop

So struct is faster.
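Timings like these vary with the machine and the Python version, so it may be worth repeating the measurement outside IPython. The following is a sketch using only the standard timeit module; the function names with_struct and with_array are made up for this example.

```python
import random
import struct
import timeit
from array import array

floatlist = [random.random() for _ in range(10**5)]

def with_struct():
    return struct.pack('%df' % len(floatlist), *floatlist)

def with_array():
    return array('f', floatlist).tobytes()

# Both produce the same bytes (native byte order, 4 bytes per float).
same = with_struct() == with_array()

for fn in (with_struct, with_array):
    # Best of 3 runs of 10 calls each, reported per call.
    best = min(timeit.repeat(fn, number=10, repeat=3))
    print('%s: %.2f ms per call' % (fn.__name__, best / 10 * 1000))
```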

Answer 5

That should work:

return struct.pack('f' * len(floatList), *floatList)

Answer 6

As with strings, using .join() will be faster than continually concatenating. E.g.:

import struct
b = bytes()
floatList = [5.4, 3.5, 7.3, 6.8, 4.6]
b = b.join((struct.pack('f', val) for val in floatList))

Results in:

b'\xcd\xcc\xac@\x00\x00`@\x9a\x99\xe9@\x9a\x99\xd9@33\x93@'

Answer 7

As you say that you really do want single-precision 'f' floats, you might like to try the array module (in the standard library since 1.x).

>>> mylist = []
>>> import array
>>> myarray = array.array('f')
>>> for guff in [123.45, -987.654, 1.23e-20]:
...    mylist.append(guff)
...    myarray.append(guff)
...
>>> mylist
[123.45, -987.654, 1.23e-20]
>>> myarray
array('f', [123.44999694824219, -987.6539916992188, 1.2299999609665927e-20])
>>> import struct
>>> mylistb = struct.pack(str(len(mylist)) + 'f', *mylist)
>>> myarrayb = myarray.tobytes()
>>> myarrayb == mylistb
True
>>> myarrayb
b'f\xe6\xf6B\xdb\xe9v\xc4&Wh\x1e'

This can save you a bag-load of memory, while still having a variable-length container with most of the list methods. The array.array approach takes 4 bytes per single-precision float. The list approach consumes a pointer to a Python float object (4 or 8 bytes) plus the size of that object; on a 32-bit CPython implementation, that is 16:

>>> import sys
>>> sys.getsizeof(123.456)
16

Total: 20 bytes per item best case for a list, 4 bytes per item always for an array.array('f').
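The per-item figures can be checked on your own interpreter. This is a small sketch; the size of a float object depends on the platform and build (16 is the 32-bit figure quoted above), so no particular value is assumed for it here.

```python
import sys
from array import array

values = [123.45, -987.654, 1.23e-20]
myarray = array('f', values)

# Bytes used per element inside the array (4 for 'f' on common platforms).
bytes_per_item = myarray.itemsize

# Size of the packed payload is simply count * itemsize.
payload_size = len(myarray) * myarray.itemsize

# Size of one Python float object; platform dependent.
float_object_size = sys.getsizeof(values[0])
```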

Answer 8

In my opinion the best way is to use a loop:

e.g.

import struct

file_i = "test.txt"
fd_out = open("test_bin_file", 'wb')
with open(file_i, 'r') as f_i:
    for i, riga in enumerate(f_i):
        # each input line holds one float; write it as 4 bytes
        value = float(riga)
        print(i, value)
        fd_out.write(struct.pack('f', value))

fd_out.close()

To append to an existing file use instead:

fd_out= open ("test_bin_file",'ab')

Answer 9

Most of the slowness comes from repeatedly appending to a bytestring, which copies the bytestring each time. Instead, you should use b''.join():

import struct
packed = [struct.pack('f', val) for val in floatList]
return b''.join(packed)
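Wrapped in a function, the join approach might look like the sketch below. The name pack_floats and the sample values are made up for illustration; the result is compared against a single struct.pack call to show that both give the same bytes.

```python
import struct

def pack_floats(float_list):
    # Pack each value separately, then join once at the end.
    packed = [struct.pack('f', val) for val in float_list]
    return b''.join(packed)

sample = [5.4, 3.5, 7.3, 6.8, 4.6]
data = pack_floats(sample)

# Same bytes as packing everything in one call.
one_call = struct.pack('%df' % len(sample), *sample)
```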

9 answers

Answer 1: 55 votes, accepted, 2012-03-30 10:13:05
Answer 2: 9 votes, 2012-03-30 15:12:54
Answer 3: 5 votes, 2019-03-09 20:42:30
Answer 4: 2 votes, 2015-12-15 13:43:40
Answer 5: 2 votes, 2012-03-30 10:42:59
Answer 6: 1 vote, 2012-03-30 10:13:18
Answer 7: 0 votes, 2012-03-30 21:07:08
Answer 8: -1 votes, 2018-09-25 15:41:05
Answer 9: -1 votes, 2012-03-30 10:13:09
