[英]Two Pandas dataframes: Based on date, add value to dataframe
My first Pandas dataframe looks (exemplarily) as follows - Date
values are of datetime
, Temp in F
values are of float
and Signal
values are of string
.我的第一个 Pandas dataframe 看起来(示例性地)如下 - Date
值是datetime
, Temp in F
值是float
和Signal
值是string
。
Date Temp in F Signal
1 1990-10-01 2.23337 freeze
2 1991-07-31 2.99860 defrost
3 1991-10-01 3.12221 freeze
4 1992-07-31 3.2328 defrost
5 1992-10-01 4.21327 freeze
6 1993-07-29 2.23222 defrost
7 1993-10-01 1.53200 freeze
8 1994-07-29 2.15030 defrost
9 1994-10-03 1.22299 freeze
.. ... ... ...
57 2018-10-01 8.95045 freeze
58 2019-07-31 9.32463 defrist
59 2019-10-01 9.34722 freeze
60 2020-07-31 10.53200 defrost
61 2020-10-01 10.34000 freeze
My second Pandas dataframe looks (exemplarily) as follows - Date
values are of datetime
, Moisture
values are of float.我的第二个 Pandas dataframe 看起来(示例性地)如下 - Date
值是datetime
, Moisture
值是浮动的。
Date Moisture
1994-05-27 4.21232
1995-05-19 3.30000
1996-05-24 3.43227
1997-05-23 3.63333
1998-05-15 4.60000
1999-05-28 2.43240
2000-05-26 1.34237
2001-05-21 1.23430
2002-05-29 2.34343
2003-05-02 1.83433
2004-04-29 2.34341
2005-06-28 3.15373
2006-05-05 1.78565
2007-05-04 0.34533
2008-08-02 0.42267
2009-05-07 0.40000
2010-08-07 0.30000
2011-05-06 2.30000
2012-05-04 3.12300
2013-05-06 4.10200
2014-05-02 2.42000
2015-05-08 2.53300
2016-06-09 1.20000
2017-05-11 1.45000
2018-05-10 1.30000
2019-05-15 1.67230
2020-05-29 2.34000
Now I want to add the values of Moisture
to the first dataframe based on their date value where the Signal
is freeze
.现在我想根据Signal
为freeze
的日期值将Moisture
的值添加到第一个 dataframe 。 To give you examples给你举例
This procedure should be done with all values of the second dataframe's column "Moisture" so that the modified first dataframe should look as follows:此过程应使用第二个数据帧的“水分”列的所有值完成,以便修改后的第一个 dataframe 应如下所示:
Date Temp in F Signal Moisture
1 1990-10-01 2.23337 freeze
2 1991-07-31 2.99860 defrost
3 1991-10-01 3.12221 freeze
4 1992-07-31 3.2328 defrost
5 1992-10-01 4.21327 freeze
6 1993-07-29 2.23222 defrost
7 1993-10-01 1.53200 freeze
8 1994-07-29 2.15030 defrost
9 1994-10-03 1.22299 freeze 4.21232
.. ... ... ...
57 2018-10-01 8.95045 freeze 1.30000
58 2019-07-31 9.32463 defrist
59 2019-10-01 9.34722 freeze 1.67230
60 2020-07-31 10.53200 defrost
61 2020-10-01 10.34000 freeze 2.34000
Anyone an idea how to do solve that?任何人都知道如何解决这个问题?
Use merge_asof()
with subset of required data for join, then add data back in.将merge_asof()
与连接所需数据的子集一起使用,然后重新添加数据。
df1 = pd.read_csv(io.StringIO(""" Date Temp_in_F Signal
1 1990-10-01 2.23337 freeze
2 1991-07-31 2.99860 defrost
3 1991-10-01 3.12221 freeze
4 1992-07-31 3.2328 defrost
5 1992-10-01 4.21327 freeze
6 1993-07-29 2.23222 defrost
7 1993-10-01 1.53200 freeze
8 1994-07-29 2.15030 defrost
9 1994-10-03 1.22299 freeze
.. ... ... ...
57 2018-10-01 8.95045 freeze
58 2019-07-31 9.32463 defrist
59 2019-10-01 9.34722 freeze
60 2020-07-31 10.53200 defrost
61 2020-10-01 10.34000 freeze"""), sep="\s+")
df1["Date"] = pd.to_datetime(df1["Date"],errors="coerce")
df1 = df1.dropna().reset_index(drop=True)
df2 = pd.read_csv(io.StringIO(""" Date Moisture
1994-05-27 4.21232
1995-05-19 3.30000
1996-05-24 3.43227
1997-05-23 3.63333
1998-05-15 4.60000
1999-05-28 2.43240
2000-05-26 1.34237
2001-05-21 1.23430
2002-05-29 2.34343
2003-05-02 1.83433
2004-04-29 2.34341
2005-06-28 3.15373
2006-05-05 1.78565
2007-05-04 0.34533
2008-08-02 0.42267
2009-05-07 0.40000
2010-08-07 0.30000
2011-05-06 2.30000
2012-05-04 3.12300
2013-05-06 4.10200
2014-05-02 2.42000
2015-05-08 2.53300
2016-06-09 1.20000
2017-05-11 1.45000
2018-05-10 1.30000
2019-05-15 1.67230
2020-05-29 2.34000"""), sep="\s+")
df2["Date"] = pd.to_datetime(df2["Date"])
# useful ...
mask = df1["Signal"]=="freeze"
# exclude defrost from merge_asof()
pd.concat([
pd.merge_asof(df1[mask].sort_values("Date"), df2.sort_values("Date"), on="Date")
, df1[~mask] # put defrost back in
]).sort_values("Date").reset_index(drop=True)
Date Temp_in_F Signal Moisture
1990-10-01 2.23337 freeze NaN
1991-07-31 2.99860 defrost NaN
1991-10-01 3.12221 freeze NaN
1992-07-31 3.2328 defrost NaN
1992-10-01 4.21327 freeze NaN
1993-07-29 2.23222 defrost NaN
1993-10-01 1.53200 freeze NaN
1994-07-29 2.15030 defrost NaN
1994-10-03 1.22299 freeze 4.21232
2018-10-01 8.95045 freeze 1.30000
2019-07-31 9.32463 defrist NaN
2019-10-01 9.34722 freeze 1.67230
2020-07-31 10.53200 defrost NaN
2020-10-01 10.34000 freeze 2.34000
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.