I'm trying to combine certain columns by index of this dataframe, which I obtained with a simple pandas cov() call to calculate the variances and covariances of u_centro, v_centro and w_centro.
However, when I try to slice some of these values using .loc, the performance is very slow. For example:
df_uu = df.loc[(iz_centro,'u_centro'),'u_centro']
where I want all the combinations of u_centro by u_centro. The result is exactly what I wanted, but the time spent to complete this is absurd: more than 10 minutes.
The whole data: https://raw.githubusercontent.com/AlessandroMDO/LargeEddySimulation/master/sd.csv
There are different ways to do this, but the best performance comes from vectorized operations such as xs (thanks @Paul H) or boolean masks, for example:
from datetime import datetime
startime = datetime.now()
# Keep every row whose second index level is 'u_centro'
mask = df.index.get_level_values(1) == 'u_centro'
df.loc[mask]
print(datetime.now() - startime) # 0:00:00.001417
I don't know if 1417 µs is a big deal in this case.
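For the xs route mentioned above, a minimal sketch might look like the following. It assumes the covariance frame has a two-level index of (iz_centro, variable name), as a grouped cov() call would produce; the small random `raw` frame is a hypothetical stand-in for the real data, not the linked CSV.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 3 heights (iz_centro) with 50 samples each.
rng = np.random.default_rng(0)
raw = pd.DataFrame({
    "iz_centro": np.repeat([0, 1, 2], 50),
    "u_centro": rng.normal(size=150),
    "v_centro": rng.normal(size=150),
    "w_centro": rng.normal(size=150),
})

# Assumption: the covariance frame is indexed by (iz_centro, variable name).
df = raw.groupby("iz_centro").cov()

# Cross-section on the second index level, then pick the matching column.
# xs drops the selected level, so the result is indexed by iz_centro only.
uu_xs = df.xs("u_centro", level=1)["u_centro"]

# Same selection with a boolean mask, restricted to one column.
mask = df.index.get_level_values(1) == "u_centro"
uu_mask = df.loc[mask, "u_centro"].droplevel(1)
```

Both selections return one u_centro variance per iz_centro value. Neither builds a list of (iz_centro, 'u_centro') tuples for .loc to look up, which seems to be the slow part in the question.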