How to print rows if a list of values appears in any column of a pandas DataFrame

I would like to print all rows of a dataframe in which any of the columns contains some of the values from a list. The dataframe has this structure:

1476 13/03/2013  4 10 26 37 47 57
1475 09/03/2013 12 13 37 44 48 51
1474 06/03/2013  1  2  3 11 28 43
1473 02/03/2013  2 12 33 57 58 60
1472 27/02/2013 12 18 23 25 45 50
1471 23/02/2013 10 25 33 36 40 58
1470 20/02/2013  2 34 36 38 51 55
1469 16/02/2013  4 13 35 54 56 58
1468 13/02/2013  1  2 10 19 20 37
1467 09/02/2013 23 24 26 41 52 53
1466 06/02/2013  4  6 13 34 37 51
1465 02/02/2013  6 11 16 26 44 53
1464 30/01/2013  2 24 32 50 54 59
1463 26/01/2013 13 22 28 29 40 48
1462 23/01/2013  5  9 25 27 38 40
1461 19/01/2013 31 36 44 47 49 54
1460 16/01/2013  4 14 27 38 50 52
1459 12/01/2013  2  6 30 34 35 52
1458 09/01/2013  2  4 16 33 44 51
1457 05/01/2013 15 16 34 42 46 59
1456 02/01/2013  6  8 14 26 36 40
1455 31/12/2012 14 32 33 36 41 52
1454 22/12/2012  4 27 29 41 48 52
1453 20/12/2012  6 13 25 32 47 57

First: I have a Series of 3-value tuples, built from the combinations of 6 different values.

Second: I have a dataframe with 2143 rows. I want to check whether any of these rows contains all three values of a combination, in any order across the columns.

from itertools import combinations, groupby
from pandas import Series
from operator import itemgetter

inputlist = [2,12,35,51,57,58]
combined = combinations(inputlist, 3)

series = Series(list(g) for k, g in groupby(combined, key=itemgetter(0)))

This gave me:

0    [(2, 12, 35), (2, 12, 51), (2, 12, 57), (2, 12...
1    [(12, 35, 51), (12, 35, 57), (12, 35, 58), (12...
2           [(35, 51, 57), (35, 51, 58), (35, 57, 58)]
3                                       [(51, 57, 58)]

I just tried the query command, and this is what I got:

df_ordered.query('_1 == 2 & _2 == 12')

ID      DATE        _1  _2  _3  _4  _5  _6

405     2002-10-19  2   12  32  38  47  48
615     2004-11-17  2   12  16  24  26  54
732     2006-01-28  2   12  26  31  43  46
1361    2012-02-11  2   12  19  22  36  58
1472    2013-03-02  2   12  33  57  58  60
1523    2013-08-24  2   12  40  46  52  53
1711    2015-06-10  2   12  19  29  50  59
2142    2019-04-17  2   12  35  51  57  58 

Now I want to extend this to look across all of those columns and find any of those values.

I also don't know how to plug the Series into a loop that feeds its values into the query statement.

EDIT: I tried the isin command, but I have no idea how to extend it to the 6 columns I have.

df[df._1.isin(combined)]

IIUC, you could try creating a boolean mask with a list comprehension, using set.issuperset, numpy.reshape and numpy.any:

import numpy as np
from itertools import combinations

inputlist = [2,12,35,51,57,58]
combined = np.array(list(combinations(inputlist, 3)))

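# Test every row against every 3-value combination:
# True when the row contains all three values of that combination.
mask = (np.array([set(row).issuperset(c) for row in df.values for c in combined])
        # One line per dataframe row, one column per combination;
        # keep the rows that match at least one combination.
        .reshape(len(df), -1).any(axis=1))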

print(df[mask])

[out]

     ID        DATE  _1  _2  _3  _4  _5  _6
3  1473  02/03/2013   2  12  33  57  58  60
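
If the numbers within a row are all distinct (as they appear to be in the data shown), a row contains some 3-value combination of inputlist exactly when at least three of its cells are in inputlist. The following is only a sketch under that assumption; the frame `sample` and the names `number_cols`, `hits` and `matches` are illustrative and not part of the original post. It uses four rows copied from the question:

```python
import pandas as pd

# Hypothetical sample: four rows copied from the question's data.
sample = pd.DataFrame(
    [
        [1476, "13/03/2013", 4, 10, 26, 37, 47, 57],
        [1475, "09/03/2013", 12, 13, 37, 44, 48, 51],
        [1473, "02/03/2013", 2, 12, 33, 57, 58, 60],
        [1469, "16/02/2013", 4, 13, 35, 54, 56, 58],
    ],
    columns=["ID", "DATE", "_1", "_2", "_3", "_4", "_5", "_6"],
)

inputlist = [2, 12, 35, 51, 57, 58]
number_cols = ["_1", "_2", "_3", "_4", "_5", "_6"]

# Count, per row, how many of the six number cells belong to inputlist.
hits = sample[number_cols].isin(inputlist).sum(axis=1)

# Assumption: values within a row are distinct, so three matching cells
# mean three different values from inputlist are present.
matches = sample[hits >= 3]
print(matches)
```

Selecting the number columns by name also keeps the ID and DATE columns out of the comparison, so an ID that happens to equal one of the listed values cannot produce a false match.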

You can use isin in combination with any(axis=1) to keep the rows in which at least one column holds a value from the list:

inputlist = [2,12,35,51,57,58]

# Columns 0 and 1 are ID and Date, so the number columns start at position 2.
df2 = df[df.iloc[:, 2:].isin(inputlist).any(axis=1)]

print(df2)
      ID        Date  _1  _2  _3  _4  _5  _6
0   1476  13/03/2013   4  10  26  37  47  57
1   1475  09/03/2013  12  13  37  44  48  51
2   1474  06/03/2013   1   2   3  11  28  43
3   1473  02/03/2013   2  12  33  57  58  60
4   1472  27/02/2013  12  18  23  25  45  50
5   1471  23/02/2013  10  25  33  36  40  58
6   1470  20/02/2013   2  34  36  38  51  55
7   1469  16/02/2013   4  13  35  54  56  58
8   1468  13/02/2013   1   2  10  19  20  37
10  1466  06/02/2013   4   6  13  34  37  51
12  1464  30/01/2013   2  24  32  50  54  59
17  1459  12/01/2013   2   6  30  34  35  52
18  1458  09/01/2013   2   4  16  33  44  51
23  1453  20/12/2012   6  13  25  32  47  57
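
The question also asks how to loop over the combinations. One possible sketch, assuming the values within a row are distinct, checks each 3-value combination in turn and records the IDs of the rows that contain all three values. The frame `sample` and the names `number_cols` and `found` are illustrative and not from the original post:

```python
from itertools import combinations

import pandas as pd

# Hypothetical sample: three rows copied from the question's data.
sample = pd.DataFrame(
    [
        [1475, "09/03/2013", 12, 13, 37, 44, 48, 51],
        [1473, "02/03/2013", 2, 12, 33, 57, 58, 60],
        [1469, "16/02/2013", 4, 13, 35, 54, 56, 58],
    ],
    columns=["ID", "DATE", "_1", "_2", "_3", "_4", "_5", "_6"],
)

inputlist = [2, 12, 35, 51, 57, 58]
number_cols = ["_1", "_2", "_3", "_4", "_5", "_6"]

# Map each 3-value combination to the IDs of the rows containing all of it.
found = {}
for combo in combinations(inputlist, 3):
    # Assumption: values within a row are distinct, so a row holds the whole
    # combination when exactly len(combo) of its cells are in the combination.
    mask = sample[number_cols].isin(list(combo)).sum(axis=1) == len(combo)
    if mask.any():
        found[combo] = sample.loc[mask, "ID"].tolist()

for combo, ids in found.items():
    print(combo, ids)
```

For the full 2143-row frame this runs one vectorised isin pass per combination (20 passes for six values taken three at a time), which avoids building a query string for each combination.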

