How to get a new column in a dataframe that is based on multiple conditions between two dataframes?
I have two dataframes and I want to add a column to DF1 that holds the "Current Date" value plus the number of days that DF2 lists for the matching status and technology. For example, the first value in the "New Date" column below is 18/03/2022 + 1095 days, because that row has technology = Wind and status = Construction.
DF 1
Current Date | Technology | Status | New Date DESIRED FROM CODE |
---|---|---|---|
18/03/2022 | Wind | Construction | 16/12/2022 |
15/02/2022 | Solar | Construction | 15/11/2022 |
24/01/2022 | Battery | Application approved | 24/10/2022 |
23/09/2020 | Wind | Application approved | 24/03/2023 |
18/11/2021 | Solar | Application submitted | 18/11/2023 |
25/06/2020 | Solar | Application approved | 25/03/2021 |
27/02/2020 | Wind | Application submitted | 25/02/2025 |
10/03/2022 | Battery | Application submitted | 09/03/2024 |
DF 2
Technology | Application submitted | Application approved | Construction |
---|---|---|---|
Battery | 730 | 273.75 | 273.75 |
Solar Photovoltaics | 730 | 273.75 | 273.75 |
Wind | 1825 | 912.5 | 1095 |
Use `DataFrame.melt` to reshape DF2 into long format, then convert the values to timedeltas with `to_timedelta` (remove `.astype(int)` if you need sub-day accuracy):
import pandas as pd

# Wide -> long: one row per (Technology, Status) pair, then days -> timedelta
df2 = (df2.melt('Technology', var_name='Status', value_name='New Date')
          .assign(**{'New Date':
                     lambda x: pd.to_timedelta(x['New Date'].astype(int), unit='d')}))
print(df2)
            Technology                 Status   New Date
0              Battery  Application submitted   730 days
1  Solar Photovoltaics  Application submitted   730 days
2                 Wind  Application submitted  1825 days
3              Battery   Application approved   273 days
4  Solar Photovoltaics   Application approved   273 days
5                 Wind   Application approved   912 days
6              Battery           Construction   273 days
7  Solar Photovoltaics           Construction   273 days
8                 Wind           Construction  1095 days
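The effect of the `.astype(int)` call above can be seen in isolation. The following is a minimal sketch, assuming a small standalone Series of the day counts from DF 2 (the names `days`, `truncated` and `exact` are made up for illustration):

```python
import pandas as pd

# Day counts taken from DF 2; 273.75 and 912.5 are fractional days.
days = pd.Series([730, 273.75, 912.5])

# Casting to int first drops the fractional part (273.75 -> 273 days).
truncated = pd.to_timedelta(days.astype(int), unit="D")

# Converting the floats directly keeps the fraction as hours
# (273.75 days -> 273 days 18:00:00).
exact = pd.to_timedelta(days, unit="D")

print(truncated.tolist())
print(exact.tolist())
```

With the fractional version, the resulting "New Date" values carry a time-of-day component, so whether to truncate depends on whether whole calendar dates are wanted.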
Then do a left join on both key columns and add the `Current Date` column:
# Left join keeps every row of df1; rows without a match in df2 get NaT
df = df1.merge(df2, on=['Technology','Status'], how='left')
# timedelta + parsed date (day first, e.g. 18/03/2022) -> datetime
df['New Date'] += pd.to_datetime(df['Current Date'], dayfirst=True)
print(df)
   Current Date  Technology                 Status    New Date
0    18/03/2022        Wind           Construction  2025-03-17
1    15/02/2022       Solar           Construction         NaT
2    24/01/2022     Battery   Application approved  2022-10-24
3    23/09/2020        Wind   Application approved  2023-03-24
4    18/11/2021       Solar  Application submitted         NaT
5    25/06/2020       Solar   Application approved         NaT
6    27/02/2020        Wind  Application submitted  2025-02-25
7    10/03/2022     Battery  Application submitted  2024-03-09
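The `NaT` rows in the output above come from keys that exist in DF1 but have no exact counterpart in DF2 ("Solar" versus "Solar Photovoltaics"). One way to spot such rows is the `indicator=True` option of `merge`. This is a rough sketch on a cut-down copy of the data; the frame name `lookup` and the `Days` column are assumptions for illustration, not part of the original answer:

```python
import pandas as pd

df1 = pd.DataFrame({
    "Technology": ["Wind", "Solar", "Battery"],
    "Status": ["Construction", "Construction", "Application approved"],
})
# Hypothetical long-format lookup, shaped like the melted DF 2.
lookup = pd.DataFrame({
    "Technology": ["Wind", "Solar Photovoltaics", "Battery"],
    "Status": ["Construction", "Construction", "Application approved"],
    "Days": [1095, 273, 273],
})

# indicator=True adds a "_merge" column: "both" or "left_only" for a left join.
checked = df1.merge(lookup, on=["Technology", "Status"],
                    how="left", indicator=True)

# Rows of df1 that found no partner in the lookup table.
unmatched = checked.loc[checked["_merge"] == "left_only",
                        ["Technology", "Status"]]
print(unmatched)
```

Here only the "Solar" row is reported, which points at the spelling of the technology name as the thing to normalise.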
To match the `Solar Photovoltaics` values, split the string and keep the first word:
# Start again from the original wide df2: "Solar Photovoltaics" -> "Solar"
df2['Technology'] = df2['Technology'].str.split().str[0]
df2 = (df2.melt('Technology', var_name='Status', value_name='New Date')
          .assign(**{'New Date':
                     lambda x: pd.to_timedelta(x['New Date'].astype(int), unit='d')}))
print(df2)
   Technology                 Status   New Date
0     Battery  Application submitted   730 days
1       Solar  Application submitted   730 days
2        Wind  Application submitted  1825 days
3     Battery   Application approved   273 days
4       Solar   Application approved   273 days
5        Wind   Application approved   912 days
6     Battery           Construction   273 days
7       Solar           Construction   273 days
8        Wind           Construction  1095 days
df = df1.merge(df2, on=['Technology','Status'], how='left')
df['New Date'] += pd.to_datetime(df['Current Date'], dayfirst=True)
print(df)
   Current Date  Technology                 Status    New Date
0    18/03/2022        Wind           Construction  2025-03-17
1    15/02/2022       Solar           Construction  2022-11-15
2    24/01/2022     Battery   Application approved  2022-10-24
3    23/09/2020        Wind   Application approved  2023-03-24
4    18/11/2021       Solar  Application submitted  2023-11-18
5    25/06/2020       Solar   Application approved  2021-03-25
6    27/02/2020        Wind  Application submitted  2025-02-25
7    10/03/2022     Battery  Application submitted  2024-03-09
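For reference, the steps above can be put together into one self-contained script. This is a sketch rather than the answer's exact code: it uses only the first three rows of DF 1, keeps the day counts in a separate `Days` column (a name chosen here for illustration), and additionally formats the result as dd/mm/yyyy strings to match the layout of the question's table:

```python
import pandas as pd

df1 = pd.DataFrame({
    "Current Date": ["18/03/2022", "15/02/2022", "24/01/2022"],
    "Technology": ["Wind", "Solar", "Battery"],
    "Status": ["Construction", "Construction", "Application approved"],
})
df2 = pd.DataFrame({
    "Technology": ["Battery", "Solar Photovoltaics", "Wind"],
    "Application submitted": [730, 730, 1825],
    "Application approved": [273.75, 273.75, 912.5],
    "Construction": [273.75, 273.75, 1095],
})

# Reduce "Solar Photovoltaics" to "Solar" so the keys line up with df1.
lookup = df2.assign(Technology=df2["Technology"].str.split().str[0])
# Wide -> long: one row per (Technology, Status) pair.
lookup = lookup.melt("Technology", var_name="Status", value_name="Days")
# Whole days only, mirroring the .astype(int) truncation used above.
lookup["Days"] = pd.to_timedelta(lookup["Days"].astype(int), unit="D")

out = df1.merge(lookup, on=["Technology", "Status"], how="left")
current = pd.to_datetime(out["Current Date"], dayfirst=True)
# Add the offset, then render back to day-first strings.
out["New Date"] = (current + out["Days"]).dt.strftime("%d/%m/%Y")
out = out.drop(columns="Days")
print(out)
```

Keeping the offset in its own column until the end avoids overwriting a timedelta column with datetimes, which some readers find easier to follow than the in-place `+=`.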