简体   繁体   中英

duplicating rows of data and then adding a series of dates in a new column with pandas

I have a dataset in the below format however the data set is a lot bigger:

import pandas as pd
df1 = pd.DataFrame({'From': ['RU','USA','ME'],
               'To': ['JK', 'JK', 'JK'],
               'Distance':[ 40000,30000,20000],
               'Days': [8,6,4]})

I want to add a date range to each location:

date_rng = pd.date_range(start='01/02/2020', freq='MS', periods=3)

The end result should look like this:

df2 = pd.DataFrame({'Date':['01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020'],
               'From': ['RU', 'RU', 'RU','USA', 'USA', 'USA','ME', 'ME', 'ME'],
               'To': ['JK', 'JK', 'JK','JK', 'JK', 'JK','JK', 'JK', 'JK'],
               'Distance':[ 40000, 40000, 40000,30000, 30000, 30000,20000, 20000, 20000],
               'Days': [8,8,8,6,6,6,4,4,4]})

Perhaps this will do what you need:

l = []
for d in date_rng:
    df1['Date'] = d
    l.append(df1.copy())

pd.concat(l)

This should do the trick:

df1["Date"]=pd.Series({df1.index[0]: date_rng.to_list()})
df1["Date"]=df1["Date"].ffill()
df1=df1.explode("Date")

Output:

  From  To  Distance  Days       Date
0   RU  JK     40000     8 2020-02-01
0   RU  JK     40000     8 2020-03-01
0   RU  JK     40000     8 2020-04-01
1  USA  JK     30000     6 2020-02-01
1  USA  JK     30000     6 2020-03-01
1  USA  JK     30000     6 2020-04-01
2   ME  JK     20000     4 2020-02-01
2   ME  JK     20000     4 2020-03-01
2   ME  JK     20000     4 2020-04-01

Ref:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.explode.html

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM