How to keep multiple rows if they have the same value within a column AND at least one of them contains a string in another column

I have searched multiple threads and cannot seem to figure this out. Any help would be appreciated. Suppose I have the following data set (I have simplified it for the sake of this question):

[Image not available: input data set]

I want to group together all rows that contain the same value in COL1, then search COL2 for the string "red" within those specific rows. If at least one of the rows in that group contains "red", then I want to keep all of those rows. Thus, for this dataset, the output should look like this:

[Image not available: expected output]

Any help would be greatly appreciated. I am working in Python. Thank you!

df[df['col1'].isin(df[df['col2'] == 'red']['col1'])]


    col1    col2
0   1   red
1   1   yellow
2   1   green
7   3   red
8   3   pink
9   3   green
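The one-liner above can be tried on a small frame. The following is a minimal sketch, not the asker's actual data: the original image is not available, so the four rows with col1 == 2 are hypothetical, chosen only so that the index labels match the output shown.

```python
import pandas as pd

# Hypothetical reconstruction of the question's data. The rows with
# col1 == 2 are assumed; only groups 1 and 3 are known from the output.
df = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
    "col2": ["red", "yellow", "green", "blue", "pink", "green", "yellow",
             "red", "pink", "green"],
})

# col1 values that occur in at least one row where col2 is exactly "red"
keys_with_red = df.loc[df["col2"] == "red", "col1"]

# Keep every row whose col1 value is one of those keys
result = df[df["col1"].isin(keys_with_red)]
print(result)
```

Using `.loc` with a row mask and a column label selects the keys in one step; it is equivalent to the chained `df[df['col2'] == 'red']['col1']` in the answer.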

Do you mean that 'red' appears with the values 1 and 3 in Col1, and therefore you would like to keep all rows with value 1 or 3 in Col1? You can try this:

df[df['Col1'].isin(df['Col1'][df['Col2']=='red'])]

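To explain, I first select the Col1 values of the rows where Col2 is 'red', then keep every row whose Col1 value appears in that selection:

# Col1 values of the rows where Col2 equals 'red'
red_keys = df['Col1'][df['Col2']=='red']
# Keep all rows whose Col1 value is one of those keys
df1 = df[df['Col1'].isin(red_keys)]
print(df1)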

Output

  Col1    Col2
0    1     red
1    1  yellow
2    1   green
6    3     red
7    3    pink
8    3   green
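The comparison `== 'red'` matches only cells that are exactly 'red'. If Col2 instead holds longer text in which "red" is a substring (the question title says "contains a string"), a substring test may be closer to what is wanted. The sketch below assumes that interpretation and uses made-up data; note that a plain substring test would also match unrelated words that happen to contain "red".

```python
import pandas as pd

# Made-up data for illustration: Col2 holds free text and a missing value
df = pd.DataFrame({
    "Col1": [1, 1, 2, 2, 3, 3],
    "Col2": ["dark red", "yellow", "blue", None, "Red", "green"],
})

# True where Col2 contains "red" as a substring, ignoring case;
# na=False treats missing values as "does not contain"
has_red = df["Col2"].str.contains("red", case=False, na=False)

# Keep all rows whose Col1 value occurs in at least one matching row
result = df[df["Col1"].isin(df.loc[has_red, "Col1"])]
print(result)
```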

Using groupby and checking whether any row of COL2 in each group is "red":

df[df.groupby("COL1").COL2.transform(lambda x: x.eq("red").any())]

Output

    COL1    COL2
0   1   red
1   1   yellow
2   1   green
7   3   red
8   3   pink
9   3   green

Explanation

mask = df.groupby("COL1").COL2.transform(lambda x: x.eq("red").any())

mask is True for every row whose COL1 group has at least one "red" in COL2:

0     True
1     True
2     True
3    False
4    False
5    False
6    False
7     True
8     True
9     True
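The same group-wise mask can be built without a Python lambda, and `groupby(...).filter` offers another way to express "keep the whole group if any row matches". This is a sketch of those two alternatives, not part of the original answer; the rows with COL1 == 2 are hypothetical, since the question's data is not available.

```python
import pandas as pd

# Hypothetical data: the COL1 == 2 rows are assumed for illustration
df = pd.DataFrame({
    "COL1": [1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
    "COL2": ["red", "yellow", "green", "blue", "pink", "green", "yellow",
             "red", "pink", "green"],
})

# Row-level test first, then broadcast "any" across each COL1 group
is_red = df["COL2"].eq("red")
mask = is_red.groupby(df["COL1"]).transform("any")
kept = df[mask]

# Alternative: filter() keeps every row of the groups for which the
# function returns True
kept_via_filter = df.groupby("COL1").filter(
    lambda g: g["COL2"].eq("red").any()
)

print(kept)
```

Passing the string "any" to `transform` lets pandas use its built-in aggregation rather than calling a Python function once per group, which tends to matter on frames with many groups.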
