
Duplicating rows of data and then adding a series of dates in a new column with pandas

I have a dataset in the format below, although the real dataset is much bigger:

import pandas as pd
df1 = pd.DataFrame({'From': ['RU','USA','ME'],
               'To': ['JK', 'JK', 'JK'],
               'Distance':[ 40000,30000,20000],
               'Days': [8,6,4]})

I want to add a date range to each location:

date_rng = pd.date_range(start='01/02/2020', freq='MS', periods=3)
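As a quick check of what this range holds, here is a minimal sketch. pandas reads '01/02/2020' month-first (2 January 2020), and freq='MS' moves to the first month start on or after that date, so the range begins on 1 February 2020. The variable name as_text is only illustrative.

```python
import pandas as pd

date_rng = pd.date_range(start='01/02/2020', freq='MS', periods=3)

# Render the timestamps as ISO strings to make the parsed dates unambiguous
as_text = list(date_rng.strftime('%Y-%m-%d'))
print(as_text)  # ['2020-02-01', '2020-03-01', '2020-04-01']
```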

The end result should look like this:

df2 = pd.DataFrame({'Date':['01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020'],
               'From': ['RU', 'RU', 'RU','USA', 'USA', 'USA','ME', 'ME', 'ME'],
               'To': ['JK', 'JK', 'JK','JK', 'JK', 'JK','JK', 'JK', 'JK'],
               'Distance':[ 40000, 40000, 40000,30000, 30000, 30000,20000, 20000, 20000],
               'Days': [8,8,8,6,6,6,4,4,4]})

Perhaps this will do what you need:

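frames = []
for d in date_rng:
    # assign() returns a copy with the Date column set to d, leaving df1 unchanged
    frames.append(df1.assign(Date=d))

# stack the per-date copies; rows come out grouped by date
result = pd.concat(frames)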
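Note that the loop above returns the rows grouped by date (index 0, 1, 2, 0, 1, 2, ...), not by location as in the requested result. A minimal sketch of one way to get the requested order, assuming the original row index is unique: sort on that index with a stable sort, then reorder the columns. The names frames and result are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'From': ['RU', 'USA', 'ME'],
                    'To': ['JK', 'JK', 'JK'],
                    'Distance': [40000, 30000, 20000],
                    'Days': [8, 6, 4]})
date_rng = pd.date_range(start='01/02/2020', freq='MS', periods=3)

# One copy of df1 per date; assign() does not modify df1 itself
frames = [df1.assign(Date=d) for d in date_rng]

result = (
    pd.concat(frames)
      .sort_index(kind='mergesort')   # stable sort keeps the dates in order within each location
      .reset_index(drop=True)
)

# Put Date first, as in the requested layout
result = result[['Date', 'From', 'To', 'Distance', 'Days']]
print(result)
```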

This should do the trick:

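# put the whole list of dates into the first row's Date cell (other rows become NaN)
df1["Date"] = pd.Series({df1.index[0]: date_rng.to_list()})
# copy that list down to every row
df1["Date"] = df1["Date"].ffill()
# explode the list so that each date gets its own row
df1 = df1.explode("Date")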

Output:

  From  To  Distance  Days       Date
0   RU  JK     40000     8 2020-02-01
0   RU  JK     40000     8 2020-03-01
0   RU  JK     40000     8 2020-04-01
1  USA  JK     30000     6 2020-02-01
1  USA  JK     30000     6 2020-03-01
1  USA  JK     30000     6 2020-04-01
2   ME  JK     20000     4 2020-02-01
2   ME  JK     20000     4 2020-03-01
2   ME  JK     20000     4 2020-04-01

Ref:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.explode.html
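For comparison, the same expansion can also be written as a cross join. This is a different technique from the explode approach above, sketched on the assumption of pandas 1.2 or later, where merge accepts how='cross'. The dd/mm/YYYY string format is inferred from the requested result in the question; dates and result are illustrative names.

```python
import pandas as pd

df1 = pd.DataFrame({'From': ['RU', 'USA', 'ME'],
                    'To': ['JK', 'JK', 'JK'],
                    'Distance': [40000, 30000, 20000],
                    'Days': [8, 6, 4]})
date_rng = pd.date_range(start='01/02/2020', freq='MS', periods=3)

# Pair every row of df1 with every date (Cartesian product)
dates = pd.DataFrame({'Date': date_rng})
result = df1.merge(dates, how='cross')

# Optional: format the dates as day/month/year strings and move Date to the front
result['Date'] = result['Date'].dt.strftime('%d/%m/%Y')
result = result[['Date', 'From', 'To', 'Distance', 'Days']]
print(result)
```

The cross join leaves df1 untouched and produces a fresh 0..8 index, so no reset_index step is needed afterwards.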
