
Most efficient way to calculate CRC16 in Python

I need to calculate the CRC16 of binary numbers many times inside a loop. I have used the following method:

import numpy as np
import binascii
#I have just filled the array with random numbers
#These arrays are loaded from a file
array1=np.random.randint(0,511, size=100000)
array2=np.random.randint(0,511, size=100000)
#...
#This goes on to till say array100
#Now calculate crc of each row in a loop
for j in range(100000):
    crc=0xffff
    #Convert the number to binary 16 bit format
    temp_bin=np.binary_repr(array1[j], 16)
    crc=binascii.crc_hqx(chr(int(temp_bin[0:8],2)), crc)
    crc=binascii.crc_hqx(chr(int(temp_bin[8:16],2)), crc)
    #Similarly for array2
    temp_bin=np.binary_repr(array2[j], 16)
    crc=binascii.crc_hqx(chr(int(temp_bin[0:8],2)), crc)
    crc=binascii.crc_hqx(chr(int(temp_bin[8:16],2)), crc)
    #...
    #This goes on till array100

While this method works perfectly, it is extremely slow. Profiling shows that converting each number to a binary string is the major bottleneck in my code.

Total time: 10.9712 s

File: speedup.py

Function: abc at line 7

Line #      Hits         Time  Per Hit   % Time  Line Contents

 7                                           @profile
 8                                           def abc():
 9                                               #I have just filled the array with random numbers
10                                               #These arrays are loaded from a file
11         1       3269.0   3269.0      0.0      array1=np.random.randint(0,511, size=100000)
12         1       3206.0   3206.0      0.0      array2=np.random.randint(0,511, size=100000)
13                                               #...
14                                               #This goes on to till say array100
15                                               #Now calculate crc of each row in a loop
16    100001     237461.0      2.4      2.2      for j in range(100000):
17    100000     199887.0      2.0      1.8          crc=0xffff
18                                                   #Convert the number to binary 16 bit format
19    100000    3436116.0     34.4     31.3          temp_bin=np.binary_repr(array1[j], 16)
20    100000    1039049.0     10.4      9.5          crc=binascii.crc_hqx(chr(int(temp_bin[0:8],2)), crc)
21    100000     793751.0      7.9      7.2          crc=binascii.crc_hqx(chr(int(temp_bin[8:16],2)), crc)
22                                                   ##Similarly for array2
23    100000    3423862.0     34.2     31.2          temp_bin=np.binary_repr(array2[j], 16)
24    100000     991331.0      9.9      9.0          crc=binascii.crc_hqx(chr(int(temp_bin[0:8],2)), crc)
25    100000     843271.0      8.4      7.7          crc=binascii.crc_hqx(chr(int(temp_bin[8:16],2)), crc)

I have not been able to come up with an alternative that avoids it. Is there a more efficient and Pythonic way to convert numbers to binary, or to do this entire thing?

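Looking at your code, you can probably skip the round trip from integer to string and back, since you zero-pad each value to a 16-bit binary string only to split it in half again. Working on the raw bytes needs some care, though: the values go up to 510, so they do not fit in a single byte, and tobytes() on an element of a default integer array returns 4 or 8 bytes in native byte order rather than the two bytes (high byte first) that your loop feeds to the CRC. Converting each array once to big-endian unsigned 16-bit and slicing the resulting buffer handles both points. Instead, try:

#Serialize each array once as big-endian unsigned 16-bit integers
buf1 = array1.astype('>u2').tobytes()
buf2 = array2.astype('>u2').tobytes()
for j in range(100000):
    crc=0xffff
    #Two bytes per value, high byte first
    crc=binascii.crc_hqx(buf1[2*j:2*j+2], crc)

    #Similarly for array2
    crc=binascii.crc_hqx(buf2[2*j:2*j+2], crc)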
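
Since binascii.crc_hqx accepts a whole byte string, the per-row work can probably be reduced further to a single call per row. The following is a minimal sketch rather than a tuned implementation. It assumes Python 3 and that every value fits in an unsigned 16-bit integer (larger values would be silently truncated by the cast). row_crcs is a hypothetical helper name, not something from the question.

```python
import binascii
import numpy as np

def row_crcs(arrays, init=0xFFFF):
    """Return one CRC per row, covering the j-th element of every array in order."""
    # Stack to shape (n_rows, n_arrays) and store as big-endian unsigned 16-bit,
    # so each value becomes two bytes, high byte first.
    # Assumption: all values are in 0..65535; the cast does not check this.
    rows = np.stack(arrays, axis=1).astype('>u2')
    buf = rows.tobytes()          # C order: all bytes of row 0, then row 1, ...
    stride = 2 * rows.shape[1]    # number of bytes per row
    return [binascii.crc_hqx(buf[i:i + stride], init)
            for i in range(0, len(buf), stride)]

# Stand-ins for the arrays loaded from a file in the question
array1 = np.random.randint(0, 511, size=1000)
array2 = np.random.randint(0, 511, size=1000)
crcs = row_crcs([array1, array2])
```

The Python-level loop still runs once per row, but each iteration is a slice plus one C call instead of several conversions per value.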

In the end I found a faster way. Instead of converting the number to a binary string first, we can extract the two bytes directly with bit operators: a right shift by 8 for the high byte and a mask with 255 for the low byte. This implementation is about three times as fast.

import numpy as np
import binascii
#I have just filled the array with random numbers
#These arrays are loaded from a file
array1=np.random.randint(0,511, size=100000)
array2=np.random.randint(0,511, size=100000)
#...
#This goes on to till say array100
#Now calculate crc of each row in a loop
for j in range(100000):
    crc=0xffff
    #Feed the high byte, then the low byte, of each 16-bit value
    crc=binascii.crc_hqx(chr(array1[j] >> 8), crc)
    crc=binascii.crc_hqx(chr(array1[j] & 255), crc)
    #Similarly for array2
    crc=binascii.crc_hqx(chr(array2[j] >> 8), crc)
    crc=binascii.crc_hqx(chr(array2[j] & 255), crc)
    #...
    #This goes on till array100
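
The code above relies on chr() returning a byte string, which holds on Python 2. On Python 3, binascii.crc_hqx requires a bytes-like object, so a port might look like the sketch below. It uses only the standard library. The two function names are made up for illustration, and format(v, '016b') stands in for np.binary_repr(v, 16), which should be equivalent for non-negative values below 65536.

```python
import binascii

def crc_via_binary_string(values, init=0xFFFF):
    """Original approach: format as a 16-character bit string, then split it in half."""
    crc = init
    for v in values:
        bits = format(v, '016b')
        crc = binascii.crc_hqx(bytes([int(bits[0:8], 2)]), crc)
        crc = binascii.crc_hqx(bytes([int(bits[8:16], 2)]), crc)
    return crc

def crc_via_shift_mask(values, init=0xFFFF):
    """Bit-operator approach: high byte via shift, low byte via mask."""
    crc = init
    for v in values:
        # Both bytes go into a single crc_hqx call, high byte first
        crc = binascii.crc_hqx(bytes([v >> 8, v & 255]), crc)
    return crc

sample = [0, 1, 255, 256, 510]
same = crc_via_binary_string(sample) == crc_via_shift_mask(sample)
```

Feeding both bytes in one call gives the same CRC as two consecutive one-byte calls, because crc_hqx simply continues from the value it is passed.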

A comparison using the line profiler shows that this way of calculating the CRC is more than three times as fast:

Total time: 2.66351 s

File: speedup1.py

Function: abc at line 4

Line #      Hits         Time  Per Hit   % Time  Line Contents

 4                                           @profile
 5                                           def abc():
 6                                               #I have just filled the array with random numbers
 7                                               #These arrays are loaded from a file
 8         1       1204.0   1204.0      0.0      array1=np.random.randint(0,511, size=100000)
 9         1       1207.0   1207.0      0.0      array2=np.random.randint(0,511, size=100000)
10                                               #...
11                                               #This goes on to till say array100
12                                               #Now calculate crc of each row in a loop
13    100001      93020.0      0.9      3.5      for j in range(100000):
14    100000      83277.0      0.8      3.1          crc=0xffff
15                                                   #Convert the number to binary 16 bit format(This is the old method)
16    100000    1280059.0     12.8     48.1          temp_bin=np.binary_repr(array1[j], 16)
17    100000     351190.0      3.5     13.2          crc=binascii.crc_hqx(chr(array1[j] >> 8), crc)
18    100000     299711.0      3.0     11.3          crc=binascii.crc_hqx(chr(array1[j] & 255), crc)
19                                                   #Similarly for array2(This is the new method using bit operators)
20    100000     276893.0      2.8     10.4          crc=binascii.crc_hqx(chr(array2[j] >> 8), crc)
21    100000     276946.0      2.8     10.4          crc=binascii.crc_hqx(chr(array2[j] & 255), crc)

Use crcmod. It will generate efficient code for the specified CRC.
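
As a rough sketch of what that could look like: crc_hqx with a starting value of 0xffff corresponds to the parameter set usually catalogued as CRC-16/CCITT-FALSE (polynomial 0x1021, not reflected, no final XOR), so a matching crcmod function can be built with mkCrcFun. crcmod is a third-party package, so the sketch falls back to binascii when it is not installed. make_crc16 is a hypothetical helper name, and no speed-up has been measured here.

```python
import binascii

try:
    import crcmod  # third-party package, e.g. pip install crcmod
except ImportError:
    crcmod = None

def make_crc16():
    """Return a function computing the same CRC as binascii.crc_hqx(data, 0xFFFF)."""
    if crcmod is not None:
        # crcmod expects the polynomial with its top bit included: 0x11021
        return crcmod.mkCrcFun(0x11021, initCrc=0xFFFF, rev=False, xorOut=0x0000)
    # Fallback so the sketch still runs without crcmod
    return lambda data: binascii.crc_hqx(data, 0xFFFF)

crc16 = make_crc16()
value = 510
crc = crc16(bytes([value >> 8, value & 255]))
```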
