Selecting specific columns with specific values in pandas

Question

I have a data frame with 30 columns, and I want to filter it on values found in 10 of those columns and return all the rows that match. In the example below, I want to search for values equal to 1 in all df columns whose names end with "good".

df[df[[i for i in df.columns if i.endswith('good')]].isin([1])]

df[df[[i for i in df.columns if i.endswith('good')]] == 1]

Both of these find the right columns, but everything that does not match appears as NaN. How can I query specific columns for specific values and get back only the matching rows, without the non-matching values showing up as NaN?

Answer 1

You can filter the columns first with str.endswith, select them with [], and compare with eq. Finally, add any(axis=1) so that a row is kept if at least one of those columns equals 1.

cols = df.columns[df.columns.str.endswith('good')]
df1 = df[df[cols].eq(1).any(axis=1)]

Sample:

import pandas as pd

df = pd.DataFrame({'A':list('abcdef'),
                   'B':[1,1,4,5,5,1],
                   'C good':[7,8,9,4,2,3],
                   'D good':[1,3,5,7,1,0],
                   'E good':[5,3,6,9,2,1],
                   'F':list('aaabbb')})

print (df)
   A  B  C good  D good  E good  F
0  a  1       7       1       5  a
1  b  1       8       3       3  a
2  c  4       9       5       6  a
3  d  5       4       7       9  b
4  e  5       2       1       2  b
5  f  1       3       0       1  b

cols = df.columns[df.columns.str.endswith('good')]

print (df[cols].eq(1))
   C good  D good  E good
0   False    True   False
1   False   False   False
2   False   False   False
3   False   False   False
4   False    True   False
5   False   False    True

df1 = df[df[cols].eq(1).any(axis=1)]
print (df1)
   A  B  C good  D good  E good  F
0  a  1       7       1       5  a
4  e  5       2       1       2  b
5  f  1       3       0       1  b
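
The steps above can be wrapped in a small function if the same filter is needed for other suffixes or for several values at once. This is only a sketch: the name rows_with_values and its parameters are hypothetical, and it swaps eq for isin so that a list of values can be passed.

```python
import pandas as pd

def rows_with_values(df, suffix, values):
    """Hypothetical helper: keep rows where any column whose name
    ends with `suffix` holds one of `values`."""
    # Pick the columns by name
    cols = df.columns[df.columns.str.endswith(suffix)]
    # One boolean per row: True if any selected column matches
    row_mask = df[cols].isin(values).any(axis=1)
    return df[row_mask]

df = pd.DataFrame({'A': list('abcdef'),
                   'B': [1, 1, 4, 5, 5, 1],
                   'C good': [7, 8, 9, 4, 2, 3],
                   'D good': [1, 3, 5, 7, 1, 0],
                   'E good': [5, 3, 6, 9, 2, 1],
                   'F': list('aaabbb')})

only_ones = rows_with_values(df, 'good', [1])
ones_or_nines = rows_with_values(df, 'good', [1, 9])
print(only_ones)
print(ones_or_nines)
```

Column B also contains 1 in several rows, but it is ignored because its name does not end with "good".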

Your solution was really close; you only need to add any:

df1 = df[df[[i for i in df.columns if i.endswith('good')]].isin([1]).any(axis=1)]
print (df1)
   A  B  C good  D good  E good  F
0  a  1       7       1       5  a
4  e  5       2       1       2  b
5  f  1       3       0       1  b
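
As a rough illustration of why the original attempts produced NaN: indexing a DataFrame with a two-dimensional boolean mask keeps the full shape and blanks out the cells that are not True, whereas a one-dimensional mask (one boolean per row) selects whole rows. The variable names below are assumptions made for this sketch.

```python
import pandas as pd

df = pd.DataFrame({'A': list('abcdef'),
                   'B': [1, 1, 4, 5, 5, 1],
                   'C good': [7, 8, 9, 4, 2, 3],
                   'D good': [1, 3, 5, 7, 1, 0],
                   'E good': [5, 3, 6, 9, 2, 1],
                   'F': list('aaabbb')})

cols = [c for c in df.columns if c.endswith('good')]

# 2-D mask: one boolean per cell of the selected columns
mask_2d = df[cols].eq(1)

# Indexing with the 2-D mask keeps all six rows; cells that are not
# True (including the columns missing from the mask) become NaN
masked = df[mask_2d]

# Reducing to one boolean per row selects complete rows instead
mask_rows = mask_2d.any(axis=1)
filtered = df[mask_rows]

print(masked.shape, filtered.shape)
```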

EDIT:

If you need only the rows and columns that contain a 1, with all other rows and columns removed:

df1 = df.loc[:, df.columns.str.endswith('good')]
df2 = df1.loc[df1.eq(1).any(axis=1), df1.eq(1).any(axis=0)]
print (df2)
   D good  E good
0       1       5
4       1       2
5       0       1

Solution 1
3 Accepted 2017-07-26 14:02:57
