
Python pandas dataframe shorten the conversion time from hex string to int

My intention is to convert the whole dataframe from hex strings to ints. Currently I am able to do it based on the answer provided at "pandas dataframe.apply -- converting hex string to int number":

df = df.apply(lambda x: x.astype(str).map(lambda x: int(x, base=16)))

However, it runs very slowly, especially when the dataframe is big. I saw an answer at https://stackoverflow.com/a/52855646/5057185 saying that the lambda isn't necessary and adds overhead. I tried to implement it, but I got this error:

df2 = pd.read_csv(path+temp_file, dtype=str)
df2 = df2.dropna()
df2 = df2.apply(int,base=16)

df2 = df2.apply(int,base=16)
Traceback (most recent call last):
  File "", line 1, in
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 6487, in apply
    return op.get_result()
  File "C:\Python27\lib\site-packages\pandas\core\apply.py", line 151, in get_result
    return self.apply_standard()
  File "C:\Python27\lib\site-packages\pandas\core\apply.py", line 257, in apply_standard
    self.apply_series_generator()
  File "C:\Python27\lib\site-packages\pandas\core\apply.py", line 286, in apply_series_generator
    results[i] = self.f(v)
  File "C:\Python27\lib\site-packages\pandas\core\apply.py", line 78, in f
    return func(x, *args, **kwds)
TypeError: ("int() can't convert non-string with explicit base", u'occurred at index POWERON')

I believe this error occurs because the dtype of the dataframe is object instead of string, and that this problem is known and solved in newer versions of pandas with pd.read_csv(path+temp_file, dtype="string"). I am using an old version of pandas. How can I work around this, or is there any other method to convert the dataframe faster?

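DataFrame.apply passes each whole column (a Series) to int, not the individual strings, which is why the call fails. I think you need DataFrame.applymap for elementwise processing: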

df2 = df2.applymap(lambda x: int(x,base=16))
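To make the elementwise behaviour concrete, here is a minimal sketch on a small made-up frame. The column name POWERON is taken from the traceback; the second column and all values are hypothetical. The helper name hex_to_int is also just for illustration. The sketch falls back to DataFrame.map where it exists, since applymap was renamed to map in pandas 2.1.

```python
import pandas as pd

# Hypothetical sample data: POWERON comes from the traceback,
# COUNT and all the hex strings are invented for illustration.
df2 = pd.DataFrame({"POWERON": ["1A", "ff"], "COUNT": ["10", "0"]})

def hex_to_int(value):
    # Called once per cell, so `value` is a single string here.
    return int(value, 16)

# Older pandas only has DataFrame.applymap; pandas 2.1+ prefers DataFrame.map.
elementwise = df2.map if hasattr(pd.DataFrame, "map") else df2.applymap
converted = elementwise(hex_to_int)

print(converted)
```

Using a named function instead of a lambda does not change the result; it only makes the per-cell call easier to see.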

Another idea is to reshape with DataFrame.stack and Series.unstack:

df2 = df2.stack().apply(lambda x: int(x, 16)).unstack()
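The reshape approach can be sketched step by step as follows, again on made-up data (only the column name POWERON appears in the question). stack turns the frame into one long Series, Series.apply converts each string, and unstack restores the original rows and columns. Whether this is actually faster on a large frame is something you would need to time on your own data.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
df2 = pd.DataFrame({"POWERON": ["1A", "ff"], "COUNT": ["10", "0"]})

# One long Series indexed by (row label, column name).
stacked = df2.stack()

# Series.apply calls the function once per element.
as_int = stacked.apply(lambda x: int(x, 16))

# Move the column names back out of the index to rebuild the frame.
converted = as_int.unstack()

print(converted)
```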
