Pandas lookup from same dataframe for criteria then add to right as new column
My goal is to create an Excel VLOOKUP equivalent in Python that takes the value of the past month and adds it as a new column on the current month's row, i.e. id, month, value_current_month, value_past_month:
From this:
id month value
01 09 123
02 09 234
03 09 345
01 08 543
02 08 432
03 08 321
01 07 678
02 07 789
03 07 890
.. .. ...
To this:
id month value new
01 09 123 543
02 09 234 432
03 09 345 321
01 08 543 678
02 08 432 789
03 08 321 890
01 07 678 ...
02 07 789 ...
03 07 890 ...
.. .. ... ...
I have imported pandas and numpy and created a dataframe called "df". As I am unfamiliar with Python syntax, any help would be greatly appreciated.
Thank you!
The correct way is to create a Date column (since you will likely have multiple years, you cannot just join on month). Then merge the dataframe back onto itself, shifting the lookup copy forward by one month with + pd.DateOffset(months=1), and join on Date and id:
#sample dataframe setup
import pandas as pd
df = pd.DataFrame({'id': {0: '01',1: '02',2: '03',3: '01',4: '02',5: '03',6: '01',7: '02',8: '03'},
'month': {0: '09',1: '09',2: '09',3: '08',4: '08',5: '08', 6: '07',7: '07',8: '07'},
'value': {0: 123,1: 234,2: 345,3: 543,4: 432,5: 321, 6: 678,7: 789,8: 890}})
df
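#solution 1
# assume a single year so that a full date can be built from the month
df['Year'] = '2020'
df['Date'] = pd.to_datetime(df['Year'] + '-' + df['month'])
# lookup copy: rename value to new_value and move each Date one month forward,
# so last month's value lines up with this month's row on (Date, id)
df = (pd.merge(df,
               df[['Date', 'value', 'id']].rename({'value': 'new_value'}, axis=1)
                                          .assign(Date=df['Date'] + pd.DateOffset(months=1)),
               how='left', on=['Date', 'id'])
        .drop('Date', axis=1))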
df
Out[1]:
   id month  value  Year  new_value
0  01    09    123  2020      543.0
1  02    09    234  2020      432.0
2  03    09    345  2020      321.0
3  01    08    543  2020      678.0
4  02    08    432  2020      789.0
5  03    08    321  2020      890.0
6  01    07    678  2020        NaN
7  02    07    789  2020        NaN
8  03    07    890  2020        NaN
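To illustrate why joining on a full date matters, here is a minimal sketch of the same self-merge on data that crosses a year boundary. The 'year' column and the sample values are assumptions made up for this example; they are not part of the question's data.

```python
import pandas as pd

# Hypothetical data spanning December 2020 to January 2021.
df = pd.DataFrame({
    'id':    ['01', '02', '01', '02', '01', '02'],
    'year':  ['2021', '2021', '2020', '2020', '2020', '2020'],
    'month': ['01', '01', '12', '12', '11', '11'],
    'value': [10, 20, 30, 40, 50, 60],
})

# Build a real date (first day of the month) from year and month.
df['Date'] = pd.to_datetime(df['year'] + '-' + df['month'] + '-01')

# Lookup copy: each value is re-dated one month later, so it lines up
# with the row of the following month for the same id.
lookup = df[['id', 'Date', 'value']].rename(columns={'value': 'new_value'})
lookup['Date'] = lookup['Date'] + pd.DateOffset(months=1)

# Left merge keeps every original row; months with no predecessor get NaN.
result = df.merge(lookup, how='left', on=['id', 'Date']).drop(columns='Date')
print(result)
```

Here the January 2021 rows pick up the December 2020 values, which a join on the month number alone could not do.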
Use .shift(-3) if the problem is simple and you have three id values per month. You can change -3 to -12, for example, if you have 12 id values per month in your actual dataframe. This also assumes you have sorted your dataframe:
#solution 2
df['new'] = df['value'].shift(-3)
df
Out[2]:
   id month  value    new
0  01    09    123  543.0
1  02    09    234  432.0
2  03    09    345  321.0
3  01    08    543  678.0
4  02    08    432  789.0
5  03    08    321  890.0
6  01    07    678    NaN
7  02    07    789    NaN
8  03    07    890    NaN
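If the number of ids per month is not fixed, one possible variation (not from the original answer) is to shift within each id group instead of by a hard-coded row count. This sketch assumes a single year, zero-padded month strings, and that every id has a row for every month with no gaps; with gaps, the merge approach is the safer choice.

```python
import pandas as pd

df = pd.DataFrame({
    'id':    ['01', '02', '03', '01', '02', '03', '01', '02', '03'],
    'month': ['09', '09', '09', '08', '08', '08', '07', '07', '07'],
    'value': [123, 234, 345, 543, 432, 321, 678, 789, 890],
})

# Sort newest month first so that the "next" row of each id is the previous month.
df = df.sort_values(['month', 'id'], ascending=[False, True]).reset_index(drop=True)

# Within each id, pull the value from the following row (the previous month).
df['new'] = df.groupby('id')['value'].shift(-1)
print(df)
```

Because the shift happens per id, adding or removing ids does not require changing the shift amount.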