Filling values in a new df column based on values in another df
I have two dataframes. The first:
import pandas as pd

Job = {'Name': ["Ron", "Joe", "Dan"],
       'Job': [[2000, 2001], 1998, [2000, 1999]]
       }
df = pd.DataFrame(Job, columns=['Name', 'Job'])
Name Job
0 Ron [2000, 2001]
1 Joe 1998
2 Dan [2000, 1999]
The second:
Empty = {'Name': ["Ron", "Ron", "Ron", "Ron", "Joe", "Joe", "Joe", "Joe", "Dan", "Dan", "Dan", "Dan"],
'Year': [1998, 1999, 2000, 2001, 1998, 1999, 2000, 2001, 1998, 1999, 2000, 2001]
}
df2 = pd.DataFrame(Empty, columns = ['Name', 'Year'])
Name Year
0 Ron 1998
1 Ron 1999
2 Ron 2000
3 Ron 2001
4 Joe 1998
5 Joe 1999
6 Joe 2000
7 Joe 2001
8 Dan 1998
9 Dan 1999
10 Dan 2000
11 Dan 2001
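I want to add a column to df2 (let's call it "job_status") in which every year associated with a name in the first dataframe (df) gets a 1 in df2, and every other year gets a 0. This should be the output: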
Name Year job_status
0 Ron 1998 0
1 Ron 1999 0
2 Ron 2000 1
3 Ron 2001 1
4 Joe 1998 1
5 Joe 1999 0
6 Joe 2000 0
7 Joe 2001 0
8 Dan 1998 0
9 Dan 1999 1
10 Dan 2000 1
11 Dan 2001 0
How can I do this?
First explode the dataframe df on Job, then left-merge df2 with the exploded frame, and finally use Series.notna + view to assign the labels [0, 1] to job_status:
d = df2.merge(df.explode('Job'), left_on=['Name', 'Year'], right_on=['Name', 'Job'], how='left')
d['job_status'] = d.pop('Job').notna().view('i1')
Result:
print(d)
Name Year job_status
0 Ron 1998 0
1 Ron 1999 0
2 Ron 2000 1
3 Ron 2001 1
4 Joe 1998 1
5 Joe 1999 0
6 Joe 2000 0
7 Joe 2001 0
8 Dan 1998 0
9 Dan 1999 1
10 Dan 2000 1
11 Dan 2001 0
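For anyone who wants to run the whole thing end to end, here is a minimal self-contained sketch of the same explode + left merge approach. Two details are my own assumptions rather than part of the answer above: the exploded Job column is cast to int64 so both merge keys share a dtype, and astype('int8') replaces view('i1'), since recent pandas releases deprecate Series.view. The name exploded is just an illustrative variable.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ron", "Joe", "Dan"],
    "Job": [[2000, 2001], 1998, [2000, 1999]],
})
df2 = pd.DataFrame({
    "Name": ["Ron"] * 4 + ["Joe"] * 4 + ["Dan"] * 4,
    "Year": [1998, 1999, 2000, 2001] * 3,
})

# One row per (Name, year); a scalar such as 1998 passes through explode unchanged.
exploded = df.explode("Job")

# Assumption: cast the exploded object column to int64 so it matches df2["Year"].
exploded["Job"] = exploded["Job"].astype("int64")

# A left merge keeps every row of df2; rows without a match get NaN in "Job".
d = df2.merge(exploded, left_on=["Name", "Year"], right_on=["Name", "Job"], how="left")

# notna() is True where a match was found; astype turns True/False into 1/0.
d["job_status"] = d.pop("Job").notna().astype("int8")

print(d)
```

The left merge is what guarantees that every (Name, Year) row of df2 survives, so the new column lines up with df2 row for row as long as df contains no duplicate (Name, year) pairs.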