
How to split pandas.DataFrame by index order?

Suppose I have two pandas.DataFrame objects with the same columns:

import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.random.rand(5, 6), columns=list('abcdef'))
df2 = pd.DataFrame(np.random.rand(5, 6), columns=list('abcdef'))

I concatenate them into one:

df = pd.concat([df1, df2], ignore_index = False)

Because ignore_index is False, each frame keeps its original index values, so the labels 0 to 4 now appear twice in df.
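A minimal sketch of what the concatenated index looks like. The seeded generator is an assumption added only so the example is reproducible; it is not part of the original question.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
df1 = pd.DataFrame(rng.random((5, 6)), columns=list('abcdef'))
df2 = pd.DataFrame(rng.random((5, 6)), columns=list('abcdef'))

df = pd.concat([df1, df2], ignore_index=False)

# Each source frame keeps its own 0..4 labels, so the labels repeat.
index_labels = df.index.tolist()
has_unique_index = df.index.is_unique
```

With repeated labels, the index alone cannot tell which frame a row came from, which is why the split needs either an extra key or the original row counts.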

After I perform some data manipulation that does not change the index values, how can I reverse the concatenation so that I end up with a list of the two data frames again?

I recommend passing keys to concat, which labels the rows of each source frame in an outer index level:

df = pd.concat([df1, df2], ignore_index=False, keys=['df1', 'df2'])
df
Out[28]: 
              a         b         c         d         e         f
df1 0  0.426246  0.162134  0.231001  0.645908  0.282457  0.715134
    1  0.973173  0.854198  0.419888  0.617750  0.115466  0.565804
    2  0.474284  0.757242  0.452319  0.046627  0.935915  0.540498
    3  0.046215  0.740778  0.204866  0.047914  0.143158  0.317274
    4  0.311755  0.456133  0.704235  0.255057  0.558791  0.319582
df2 0  0.449926  0.330672  0.830240  0.861221  0.234013  0.299515
    1  0.552645  0.620980  0.313907  0.039247  0.356451  0.849368
    2  0.159485  0.620178  0.428837  0.315384  0.910175  0.020809
    3  0.687249  0.824803  0.118434  0.661684  0.013440  0.611711
    4  0.576244  0.915196  0.544099  0.750581  0.192548  0.477207

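To convert back, group on the outer index level and drop that level from each group. Note that groupby sorts the group keys by default, so the unpacking order follows the sorted keys; pass sort=False to keep the order in which the keys first appear.

df1, df2 = [y.reset_index(level=0, drop=True) for _, y in df.groupby(level=0)]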
df1
Out[30]: 
          a         b         c         d         e         f
0  0.426246  0.162134  0.231001  0.645908  0.282457  0.715134
1  0.973173  0.854198  0.419888  0.617750  0.115466  0.565804
2  0.474284  0.757242  0.452319  0.046627  0.935915  0.540498
3  0.046215  0.740778  0.204866  0.047914  0.143158  0.317274
4  0.311755  0.456133  0.704235  0.255057  0.558791  0.319582
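As a possible alternative to groupby, each frame can also be pulled out by its key directly. This is only a sketch under the same setup as above; the names df1_back, df2_back and frames are made up for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.rand(5, 6), columns=list('abcdef'))
df2 = pd.DataFrame(np.random.rand(5, 6), columns=list('abcdef'))
df = pd.concat([df1, df2], keys=['df1', 'df2'])

# Selecting a label from the outer level returns the rows under it,
# indexed by the remaining inner level only.
df1_back = df.loc['df1']
df2_back = df.xs('df2', level=0)

# Rebuild every frame in one pass, keyed by the labels given to concat.
outer_keys = df.index.get_level_values(0).unique()
frames = {key: df.xs(key, level=0) for key in outer_keys}
```

Selecting by key avoids relying on the order in which groupby yields the groups.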

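If you prefer to avoid groupby, you can slice the concatenated frame by position, using the lengths of the original frames. This assumes the manipulation did not add, drop, or reorder rows.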

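list_dfs = [df1, df2]
df = pd.concat(list_dfs, ignore_index=False)

new_dfs = []
start = 0  # position of the first row of the current frame within df
for original in list_dfs:
    # iloc makes the positional slicing explicit
    new_dfs.append(df.iloc[start:start + len(original)])
    start += len(original)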
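The same positional idea can be wrapped in a small function that refuses to split when the row count no longer matches. This is a sketch, not part of the original answer; split_by_lengths is a hypothetical helper name, and the 5-row and 3-row frames are chosen only to show unequal lengths.

```python
import numpy as np
import pandas as pd

def split_by_lengths(df, lengths):
    """Hypothetical helper: cut df into consecutive positional chunks."""
    if sum(lengths) != len(df):
        raise ValueError("row count changed; a positional split is not safe")
    # Cumulative sums give the start/stop position of every chunk.
    bounds = np.cumsum([0] + list(lengths))
    return [df.iloc[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]

df1 = pd.DataFrame(np.random.rand(5, 6), columns=list('abcdef'))
df2 = pd.DataFrame(np.random.rand(3, 6), columns=list('abcdef'))
df = pd.concat([df1, df2], ignore_index=False)

parts = split_by_lengths(df, [len(df1), len(df2)])
```

The length check only catches added or dropped rows; if rows were reordered, the positional split would still run but return the wrong frames, which is where the keys approach is more robust.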
