
How to copy paste values from another dataset conditional on a column

I have df1

    Id Data    Group_Id
0    1 A         1
1    2 B         2
2    3 B         3
      ...
100  4 A         101
101  5 A         102
      ...

and df2

      Timestamp           Group_Id
2012-01-01 00:00:05.523    1
2013-07-01 00:00:10.757    2
2014-01-12 00:00:15.507    3
                   ...
2016-03-05 00:00:05.743    101
2017-12-24 00:00:10.407    102
                   ...

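I want to match the two datasets by Group_Id, then copy only the date from Timestamp in df2 and paste it into a new column in df1 according to the corresponding Group_Id, naming the column day1.

Then I want to add 6 more columns next to day1, named day2, ..., day7, holding the next six days based on day1. So it looks like: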

    Id Data    Group_Id    day1    day2       day3        ...    day7  
0    1 A         1      2012-01-01 2012-01-02 2012-01-03         ...
1    2 B         2      2013-07-01 2013-07-02 2013-07-03         ...
2    3 B         3      2014-01-12 2014-01-13 2014-01-14         ...
                              ...
100  4 A         101    2016-03-05 2016-03-06 2016-03-07         ...
101  5 A         102    2017-12-24 2017-12-25 2017-12-26         ...
                              ...

Thanks.

First, we need to merge here:

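import pandas as pd
df1 = df1.merge(df2, how='left', on='Group_Id')
# Keep only the date part of each timestamp
start = pd.to_datetime(df1['Timestamp']).dt.normalize()
# periods=7 yields the seven dates for day1 through day7
s = pd.DataFrame([pd.date_range(x, periods=7, freq='D') for x in start], index=df1.index)
s.columns += 1  # relabel columns 0..6 as 1..7
df1 = df1.join(s.add_prefix('day'))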
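
To try the merge-then-expand idea without the original files, here is a minimal self-contained sketch. It uses only the five rows visible in the question, and the helper name `add_day_columns` is hypothetical. It is also a loop-over-columns variant (normalize the timestamp, then add a day offset per column) rather than the list comprehension above, so treat it as one possible way to do it, not the answer's exact method.

```python
import pandas as pd

# Sample rows copied from the question (only the five visible ones).
df1 = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "Data": ["A", "B", "B", "A", "A"],
    "Group_Id": [1, 2, 3, 101, 102],
})
df2 = pd.DataFrame({
    "Timestamp": pd.to_datetime([
        "2012-01-01 00:00:05.523",
        "2013-07-01 00:00:10.757",
        "2014-01-12 00:00:15.507",
        "2016-03-05 00:00:05.743",
        "2017-12-24 00:00:10.407",
    ]),
    "Group_Id": [1, 2, 3, 101, 102],
})


def add_day_columns(left, right, n_days=7):
    """Hypothetical helper: left-merge on Group_Id, then add day1..day{n_days}."""
    out = left.merge(right, on="Group_Id", how="left")
    # normalize() resets the time to midnight and keeps the datetime64 dtype.
    day1 = out["Timestamp"].dt.normalize()
    for i in range(n_days):
        out[f"day{i + 1}"] = day1 + pd.Timedelta(days=i)
    return out.drop(columns="Timestamp")


result = add_day_columns(df1, df2)
print(result[["Group_Id", "day1", "day7"]])
```

With a left merge, any row of df1 whose Group_Id has no match in df2 would end up with NaT in every day column instead of raising an error.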

Here is another approach: basically just merge the dfs, take the date from the timestamp, and create 6 new columns, adding one day each time:

import pandas as pd
df1 = pd.read_csv('df1.csv')
df2 = pd.read_csv('df2.csv')
df3 = df1.merge(df2, on='Group_Id')

df3['Timestamp'] = pd.to_datetime(df3['Timestamp']) #only necessary if not already timestamp
df3['day1'] = df3['Timestamp'].dt.date

# day2..day7: add 1 to 6 days to day1
for i in range(1, 7):
    df3[f'day{i + 1}'] = df3['day1'] + pd.Timedelta(days=i)

output:

   Id Data  Group_Id               Timestamp        day1        day2        day3        day4        day5        day6        day7
0   1    A         1 2012-01-01 00:00:05.523  2012-01-01  2012-01-02  2012-01-03  2012-01-04  2012-01-05  2012-01-06  2012-01-07
1   2    B         2 2013-07-01 00:00:10.757  2013-07-01  2013-07-02  2013-07-03  2013-07-04  2013-07-05  2013-07-06  2013-07-07
2   3    B         3 2014-01-12 00:00:15.507  2014-01-12  2014-01-13  2014-01-14  2014-01-15  2014-01-16  2014-01-17  2014-01-18
3   4    A       101 2016-03-05 00:00:05.743  2016-03-05  2016-03-06  2016-03-07  2016-03-08  2016-03-09  2016-03-10  2016-03-11
4   5    A       102 2017-12-24 00:00:10.407  2017-12-24  2017-12-25  2017-12-26  2017-12-27  2017-12-28  2017-12-29  2017-12-30

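Note that I copied your dataframes to CSV and there were only 5 rows in total, so the index differs from your example (i.e. 100, 101).

You can drop the Timestamp column if you don't need it.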
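
Dropping the column could look like the sketch below. The two-row `df3` here is only a stand-in built from the question's values so the snippet runs on its own; in practice you would apply the `drop` call to the merged frame from the code above.

```python
import pandas as pd

# Tiny stand-in for the merged frame df3 (values taken from the question).
df3 = pd.DataFrame({
    "Group_Id": [1, 2],
    "Timestamp": pd.to_datetime(
        ["2012-01-01 00:00:05.523", "2013-07-01 00:00:10.757"]
    ),
})
df3["day1"] = df3["Timestamp"].dt.date

# drop() returns a new frame, so reassign to keep the result.
df3 = df3.drop(columns="Timestamp")
print(df3.columns.tolist())  # ['Group_Id', 'day1']
```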

