Is there a faster way to convert a big file from hexadecimal to binary, and binary to int?

I have a big DataFrame (1999048 rows and 1 column) of hexadecimal data. I want to convert each line to binary, cut it into pieces, and translate each piece into decimal.

I tried this:

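for i in range(len(df.index)):
    # Reverse the bit string so that index 0 is the least significant bit
    hexa_line = hex2bin(str(f1.iloc[i]))[::-1]
    # Each slice is reversed back to MSB-first order before int(..., 2)
    channel = int(hexa_line[0:3][::-1], 2)    # bits 0-2
    edge = int(hexa_line[3][::-1], 2)         # bit 3
    time = int(hexa_line[4:32][::-1], 2)      # bits 4-31
    sweep = int(hexa_line[32:48][::-1], 2)    # bits 32-47
    tag = int(hexa_line[48:63][::-1], 2)      # bits 48-62
    datalost = int(hexa_line[63][::-1], 2)    # bit 63
    line = np.array([[channel, edge, time, sweep, tag, datalost]])
    # Copies the whole of tab on every iteration
    tab = np.concatenate((tab, line), axis=0)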

But it is really, really slow. Is there a faster way to do this?

The only thing I can imagine helping a lot would be changing these lines:

line=np.array([[channel, edge, time, sweep, tag, datalost]])
tab=np.concatenate((tab, line), axis=0)

Certainly in pandas, and I think also in NumPy, concatenating is an expensive operation: its cost depends on the total size of both arrays (unlike, say, list.append).

I think what this does is rewrite the entire array tab each time you call it, so the work per iteration grows as tab grows. Perhaps you could try appending each line to a list and then concatenating the whole list in one go at the end.

E.g. something more like this:

tab = []
for i in range (len(df.index)):
    hexa_line=hex2bin(str(f1.iloc[i]))[::-1] 
    channel = int(hexa_line[0:3][::-1], 2)     
    edge = int(hexa_line[3][::-1], 2)      
    time = int(hexa_line[4:32][::-1], 2)   
    sweep = int(hexa_line[32:48][::-1], 2)  
    tag = int(hexa_line[48:63][::-1], 2)   
    datalost = int(hexa_line[63][::-1], 2)   
    line=np.array([[channel, edge, time, sweep, tag, datalost]])
    tab.append(line)

# np.concatenate accepts a list of arrays, so this builds one (n_rows, 6) array in a single pass
final_tab = np.concatenate(tab, axis=0)
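
If the per-row string slicing is still the bottleneck after that, another option might be to skip the binary strings altogether and pull the fields out with shifts and masks on a uint64 array. This is only a rough sketch: it assumes every row is a single 64-bit hexadecimal word and that hex2bin returns a zero-padded 64-character bit string (the slices above suggest so, but I can't see hex2bin). The names decode_words and field are made up for the example.

```python
import numpy as np

def decode_words(hex_strings):
    """Return an (n, 6) array: channel, edge, time, sweep, tag, datalost."""
    # Parse each hex string exactly once (assumption: one 64-bit word per row).
    words = np.array([int(s, 16) for s in hex_strings], dtype=np.uint64)

    def field(shift, width):
        # Move the field down to bit 0, then keep only its `width` low bits.
        return (words >> np.uint64(shift)) & np.uint64((1 << width) - 1)

    return np.column_stack([
        field(0, 3),    # channel:  bits 0-2
        field(3, 1),    # edge:     bit 3
        field(4, 28),   # time:     bits 4-31
        field(32, 16),  # sweep:    bits 32-47
        field(48, 15),  # tag:      bits 48-62
        field(63, 1),   # datalost: bit 63
    ])
```

With the DataFrame from the question it would presumably be called as something like decode_words(df.iloc[:, 0].astype(str)). The only Python-level loop left is the int(s, 16) parse; the field extraction runs once per column over the whole array rather than once per row.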
