I have a large dataset split into 5 dataframes with exactly the same rows. I am wondering whether there is an efficient way to build a pivot table from each dataframe in parallel and then merge the pivot tables.
The process I would like to follow is:
df1 --> df1_pivot ---> Merge(df1_pivot, df2_pivot) ---> df1_df2_pivot
df2 --> df2_pivot
The goal is to process the dataframes in parallel (using multiprocessing) and then merge the results.
EDIT: The pivot can be multi-index, like this:
pd.pivot_table(df1, index=['col4', 'col3'], columns=['col1', 'col2'],
               values='val_tosum', aggfunc='sum')
Use pd.concat:
pd.concat([df1, df2, df3, df4, df5], axis=1)
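To also cover the parallel part of the question, here is a minimal sketch rather than a definitive implementation. It assumes the five dataframes are row-wise chunks of one dataset with the same columns (col1 to col4 and val_tosum, as in the question), so the partial pivots can be combined by adding them cell by cell, which is valid because the aggregation is a sum. The helper names pivot_one, combine_pivots and pivot_in_parallel are made up for illustration. If the chunks instead hold different column groups, the combine step would be pd.concat(pivots, axis=1) as above.

```python
from concurrent.futures import ProcessPoolExecutor

import pandas as pd


def pivot_one(df):
    """Build the multi-index pivot from the question for a single chunk."""
    return pd.pivot_table(
        df,
        index=["col4", "col3"],
        columns=["col1", "col2"],
        values="val_tosum",
        aggfunc="sum",
    )


def combine_pivots(pivots):
    """Add partial pivots cell by cell.

    Assumption: every chunk is pivoted with aggfunc='sum', so partial sums
    can simply be added. A cell present in only one pivot is treated as 0
    in the other; a cell missing from both stays NaN.
    """
    total = pivots[0]
    for partial in pivots[1:]:
        total = total.add(partial, fill_value=0)
    return total


def pivot_in_parallel(frames, max_workers=None):
    """Pivot each chunk in a separate worker process, then combine in the parent."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # pivot_one must be a module-level function so it can be pickled.
        pivots = list(pool.map(pivot_one, frames))
    return combine_pivots(pivots)
```

Usage would be something like result = pivot_in_parallel([df1, df2, df3, df4, df5]), called from under an if __name__ == "__main__": guard so that worker processes can be started safely on every platform. Each dataframe is pickled and sent to a worker, so this only pays off when the pivot itself costs more than the data transfer; it is worth timing it against a plain loop first.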