I have a big DataFrame (1,999,048 rows and 1 column) of hexadecimal data. I want to convert each row to binary, cut it into pieces, and translate each piece to decimal.
I tried this:
for i in range(len(df.index)):
    hexa_line = hex2bin(str(f1.iloc[i]))[::-1]
    channel = int(hexa_line[0:3][::-1], 2)
    edge = int(hexa_line[3][::-1], 2)
    time = int(hexa_line[4:32][::-1], 2)
    sweep = int(hexa_line[32:48][::-1], 2)
    tag = int(hexa_line[48:63][::-1], 2)
    datalost = int(hexa_line[63][::-1], 2)
    line = np.array([[channel, edge, time, sweep, tag, datalost]])
    tab = np.concatenate((tab, line), axis=0)
But it is really slow. Is there a faster way to do this?
The only thing I can imagine helping a lot would be changing these lines:
line=np.array([[channel, edge, time, sweep, tag, datalost]])
tab=np.concatenate((tab, line), axis=0)
Concatenating is an expensive thing to do, certainly in pandas and I think also in NumPy: its cost depends on the total size of both arrays (unlike, say, list.append). I think it rewrites the entire array tab each time you call it, so the loop copies more and more data as tab grows. Perhaps you could try appending each line to a list and then concatenating the whole list once at the end.
E.g. something more like this:
tab = []
for i in range(len(df.index)):
    hexa_line = hex2bin(str(f1.iloc[i]))[::-1]
    channel = int(hexa_line[0:3][::-1], 2)
    edge = int(hexa_line[3][::-1], 2)
    time = int(hexa_line[4:32][::-1], 2)
    sweep = int(hexa_line[32:48][::-1], 2)
    tag = int(hexa_line[48:63][::-1], 2)
    datalost = int(hexa_line[63][::-1], 2)
    line = np.array([[channel, edge, time, sweep, tag, datalost]])
    tab.append(line)
final_tab = np.concatenate(tab, axis=0)
# or whatever the syntax is :p
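If that is still too slow, another option (a different technique from the list-append fix above, and only a sketch) is to skip the binary strings and pull the fields out with shifts and masks on a NumPy uint64 array. This assumes every row is a 64-bit word written in hex, which is what the slice positions 0 to 63 in the loop suggest. The names FIELDS and decode_words are made up for this example, and hex2bin from the question is not used.

```python
import numpy as np

# (name, lowest bit, width) for each field, read off the slices in the loop above.
# Bit 0 is the least significant bit, matching the [::-1] reversal in the question.
FIELDS = [
    ("channel", 0, 3),
    ("edge", 3, 1),
    ("time", 4, 28),
    ("sweep", 32, 16),
    ("tag", 48, 15),
    ("datalost", 63, 1),
]


def decode_words(hex_strings):
    """Decode 64-bit hex words into an (n, 6) uint64 array, one column per field."""
    # Parsing is still a Python-level loop, but it runs once per row
    # instead of once per field, and no intermediate strings are sliced.
    words = np.array([int(h, 16) for h in hex_strings], dtype=np.uint64)
    columns = []
    for _name, low_bit, width in FIELDS:
        mask = np.uint64((1 << width) - 1)
        # Shift the field down to bit 0, then mask off everything above it.
        columns.append((words >> np.uint64(low_bit)) & mask)
    return np.column_stack(columns)


# Build a test word from known field values, then decode it again.
word = 5 | (1 << 3) | (123456 << 4) | (42 << 32) | (7 << 48) | (1 << 63)
decoded = decode_words([format(word, "016x")])
print(decoded)
```

The columns come out in the same order as in the loop (channel, edge, time, sweep, tag, datalost), so you would pass your column of hex strings to decode_words and get the whole table back in one call.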