How do I extend a pandas DataFrame by repeating the last row?

I have a DataFrame, and would like to extend it by repeating the last row n times.

Example code:

import pandas as pd
import numpy as np
dates = pd.date_range('1/1/2014', periods=4)
df = pd.DataFrame(np.eye(4, 4), index=dates, columns=['A', 'B', 'C', 'D'])
n = 3
for i in range(n):
    df = pd.concat([df, df[-1:]])  # originally df.append(df[-1:]), which needs pandas < 2.0

so df is:

            A  B  C  D
2014-01-01  1  0  0  0
2014-01-02  0  1  0  0
2014-01-03  0  0  1  0
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1

Is there a better way to do this without the for loop?

Here's an alternate (fancy indexing) way to do it:

df.append( df.iloc[[-1]*3] )

Out[757]: 
            A  B  C  D
2014-01-01  1  0  0  0
2014-01-02  0  1  0  0
2014-01-03  0  0  1  0
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
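
On pandas 2.0 and later, DataFrame.append is no longer available, so the line above only runs on older versions. A minimal sketch of the same fancy-indexing idea with pd.concat, assuming the df from the question (the name extended is only illustrative):

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2014', periods=4)
df = pd.DataFrame(np.eye(4, 4), index=dates, columns=['A', 'B', 'C', 'D'])

n = 3
# df.iloc[[-1] * n] selects the last row n times in one indexing call
extended = pd.concat([df, df.iloc[[-1] * n]])
```

The repeated rows keep the original index label, so the result has duplicate index entries, just as in the output shown above.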

You could use nested concat calls: the inner one concatenates the last row 3 times, and the outer one concatenates that result with the original df:

In [181]:

dates = pd.date_range('1/1/2014', periods=4)
df = pd.DataFrame(np.eye(4, 4), index=dates, columns=['A', 'B', 'C', 'D'])
pd.concat([df,pd.concat([df[-1:]]*3)])
Out[181]:
            A  B  C  D
2014-01-01  1  0  0  0
2014-01-02  0  1  0  0
2014-01-03  0  0  1  0
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1

This could be put into a function like so:

In [182]:

def repeatRows(d, n=3):
    return pd.concat([d]*n)

pd.concat([df,repeatRows(df[-1:], 3)])
Out[182]:
            A  B  C  D
2014-01-01  1  0  0  0
2014-01-02  0  1  0  0
2014-01-03  0  0  1  0
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
2014-01-04  0  0  0  1
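
One edge case worth knowing: repeatRows(d, 0) calls pd.concat on an empty list, which raises a ValueError. A sketch of a single-call variant that builds one flat list, so that n=0 simply returns a copy of the frame (extend_with_last_row is a hypothetical helper name, not part of pandas):

```python
import numpy as np
import pandas as pd

def extend_with_last_row(frame, n=3):
    """Return frame followed by n extra copies of its last row (hypothetical helper)."""
    # The list always contains frame itself, so pd.concat never sees an empty list
    return pd.concat([frame] + [frame.iloc[-1:]] * n)

dates = pd.date_range('1/1/2014', periods=4)
df = pd.DataFrame(np.eye(4, 4), index=dates, columns=['A', 'B', 'C', 'D'])

extended = extend_with_last_row(df, 3)   # 4 original rows + 3 repeats
unchanged = extend_with_last_row(df, 0)  # no repeats, just a copy
```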

Another way, without using any index or nested concat, is to use tail() and the unpack operator. Note that DataFrame.append is deprecated (and was removed in pandas 2.0).

pd.concat([df, *[df.tail(1)]*3]) 

Therefore, to repeat the last n rows d times:

pd.concat([df, *[df.tail(n)]*d]) 

tail(n) returns the last n rows (by default n=5).
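
As a rough illustration of the general form, assuming the df from the question and the arbitrary values n=2 and d=3 (block and result are illustrative names):

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2014', periods=4)
df = pd.DataFrame(np.eye(4, 4), index=dates, columns=['A', 'B', 'C', 'D'])

n, d = 2, 3
block = df.tail(n)                       # the last n rows
result = pd.concat([df, *[block] * d])   # df followed by d copies of block
# result has len(df) + n * d rows
```

Note that the block is repeated as a whole: the appended rows run 01-03, 01-04, 01-03, 01-04, and so on, rather than grouping all copies of one row together.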

The unpack operator ('*') unpacks a sequence or iterable into separate positional arguments of a function call (or into separate elements of a list literal, as in the concat calls above), for example:

def sum_var(a, b, c):
    return a + b + c

numbers = [1, 2, 3]

sum_result = sum_var(*numbers)  # same as sum_var(1, 2, 3), i.e. 6
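
The concat calls above use the same operator inside a list literal rather than a function call. A small sketch of the difference, with plain strings standing in for the DataFrames (all names here are illustrative):

```python
repeated = ['last'] * 3              # three references to the same item

without_unpack = ['df', repeated]    # nested: the inner list is one element
with_unpack = ['df', *repeated]      # flat: each repeated item is its own element
```

pd.concat expects a flat sequence of frames, which is why the unpacked form is the one passed to it.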
