Is there a way to filter a dataframe based on a specific value but also keep all other values for the unique identifier using pandas?
For example, suppose we have the following dataframe:
UID A
1 Yes
1 No
2 No
2 No
3 Yes
3 No
4 Yes
4 Yes
I want to produce a dataframe that includes every UID with at least one "Yes", keeping all rows for that UID even when some of them are "No":
UID A
1 Yes
1 No
3 Yes
3 No
4 Yes
4 Yes
Is there a way to do this using pandas or any other library in Python?
Try with isin:
df = df.loc[df.UID.isin(df.loc[df.A=='Yes','UID'])]
df
Out[323]:
UID A
0 1 Yes
1 1 No
4 3 Yes
5 3 No
6 4 Yes
7 4 Yes
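For reference, here is a self-contained sketch of the same isin approach that rebuilds the sample data from the question. The names yes_uids and result are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "UID": [1, 1, 2, 2, 3, 3, 4, 4],
    "A": ["Yes", "No", "No", "No", "Yes", "No", "Yes", "Yes"],
})

# UIDs that have at least one "Yes" row (illustrative name).
yes_uids = df.loc[df["A"] == "Yes", "UID"]

# Keep every row whose UID is among them, whatever its own value of A.
result = df.loc[df["UID"].isin(yes_uids)]
print(result)
```

The original index labels are preserved, which is why the output above skips labels 2 and 3.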
I would use a groupby + filter operation:
result = (
    df.groupby('UID')
      # keep a group only if at least one of its rows has A == 'Yes'
      .filter(lambda g: g['A'].eq('Yes').any())
)
And that gives me:
UID A
0 1 Yes
1 1 No
4 3 Yes
5 3 No
6 4 Yes
7 4 Yes
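A possible variant, not from the original answer, builds a boolean mask with groupby + transform instead of calling a Python lambda per group. This is only a sketch: the name has_yes is illustrative, and whether it is actually faster than filter depends on your data, so it is worth measuring.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "UID": [1, 1, 2, 2, 3, 3, 4, 4],
    "A": ["Yes", "No", "No", "No", "Yes", "No", "Yes", "Yes"],
})

# For each row, True if any row sharing its UID has A == "Yes".
has_yes = df["A"].eq("Yes").groupby(df["UID"]).transform("any")

# Boolean indexing keeps the rows of every UID that has a "Yes".
result = df[has_yes]
print(result)
```

The mask has one entry per row of df, so it can also be combined with other row-level conditions using & or |.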
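Try:
# collect the UIDs that have at least one 'Yes' (a variable named `list` would shadow the builtin)
yes_uids = df.loc[df['A'] == 'Yes', 'UID'].unique().tolist()
# keep every row whose UID is in that collection; @ refers to a local Python variable
dfnew = df.query('UID in @yes_uids')
Not elegant, but worth a try.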