How do I expand a pandas data frame such that each row becomes all previous rows?

My goal is to cumulatively add rows for each group in the data frame, as I have done manually below, but without using a for loop or df.apply() (so, ideally, in a single operation).

import pandas as pd
import numpy as np

df1 = pd.DataFrame(np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]]),
                   columns=['group', 'a', 'b'])

df2 = pd.DataFrame(np.array([[1, 1, 1], [2, 1, 1], [2, 2, 2], [3, 1, 1], [3, 1, 1], [3, 2, 2]]),
                   columns=['group', 'a', 'b'])

df1 = df1.set_index('group').sort_index()
df2 = df2.set_index('group').sort_index()

print(df1)

       a  b
group      
1      1  1
2      2  2
3      3  3

print(df2)

       a  b
group      
1      1  1
2      1  1
2      2  2
3      1  1
3      1  1
3      2  2

IIUC, you can build the filler rows with Index.repeat and concatenate them with df1:

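# Filler rows of 1s: the i-th group label is repeated i times (0, 1, 2, ...)
tmp = pd.DataFrame(1, columns=df1.columns, index=df1.index.repeat(range(len(df1))))
# Append the original rows, then sort so that each group's rows are adjacent
df2 = pd.concat([tmp, df1]).sort_index()
print(df2)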

# Output
       a  b
group      
1      1  1
2      1  1
2      2  2
3      1  1
3      1  1
3      3  3
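
To make the intermediate steps visible, here is a minimal sketch of the same approach, split up. It rebuilds df1 from the question so that it runs on its own. The names filler_index and result are only illustrative, and kind="mergesort" is an addition, not part of the answer: the default sort kind is not guaranteed to be stable, so a stable sort is requested to keep the filler rows ahead of the original row within each group.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"group": [1, 2, 3], "a": [1, 2, 3], "b": [1, 2, 3]}).set_index("group")

# Repeat the i-th index label i times (0, 1, 2, ...): label 1 is dropped,
# label 2 appears once, label 3 appears twice.
filler_index = df1.index.repeat(np.arange(len(df1)))
print(filler_index.tolist())  # [2, 3, 3]

# Filler rows holding the constant 1 in every column.
tmp = pd.DataFrame(1, columns=df1.columns, index=filler_index)

# Assumption: a stable sort (mergesort) is used here so that, within each
# group, the filler rows (concatenated first) stay ahead of the original row.
result = pd.concat([tmp, df1]).sort_index(kind="mergesort")
print(result)
```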

One line:

df2 = pd.concat([pd.DataFrame(1, columns=df1.columns, index=df1.index.repeat(range(len(df1)))), df1]).sort_index()
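
If the title is read literally, so that each group should contain every row of df1 up to and including its own, the filler would be the actual earlier rows rather than 1s. This is a different result from the outputs shown above, so treat the following as a sketch under that assumption only. The names counts, starts, take and expanded are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"group": [1, 2, 3], "a": [1, 2, 3], "b": [1, 2, 3]}).set_index("group")

n = len(df1)
# Number of rows to emit for each group: 1, 2, ..., n.
counts = np.arange(1, n + 1)

# Offset at which each group's block starts in the expanded frame,
# repeated once per row of that block.
starts = np.repeat(np.cumsum(counts) - counts, counts)

# Row positions to take from df1: 0 | 0 1 | 0 1 2 | ...
take = np.arange(counts.sum()) - starts

# Assumption: each group receives rows 0..i of df1, relabelled with its own group.
expanded = df1.iloc[take].copy()
expanded.index = df1.index.repeat(counts)
print(expanded)
```

No Python-level loop or apply is involved; the row positions come from NumPy arithmetic, and a single iloc call gathers them.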
