How to get rows where certain columns contain the same values in pandas?

I have data that looks like this:

import datetime

data = [(datetime.datetime(2021, 2, 10, 7, 49, 7, 118658), u'12.100.90.10', u'100.100.12.1', u'100.100.12.1', u'LT_DOWN'),
       (datetime.datetime(2021, 2, 10, 7, 49, 14, 312273), u'12.100.90.10', u'100.100.12.1', u'100.100.12.1', u'LT_UP'),
       (datetime.datetime(2021, 2, 10, 7, 49, 21, 535932), u'12.100.90.10', u'100.100.12.1', u'100.100.22.1', u'LT_UP'),
       (datetime.datetime(2021, 2, 10, 7, 50, 28, 264042), u'12.100.90.10', u'100.100.12.1', u'100.100.32.1', u'LT_DOWN'),
       (datetime.datetime(2021, 2, 10, 7, 50, 28, 725961), u'12.100.90.10', u'100.100.12.1', u'100.100.32.1', u'PL_DOWN'),
       (datetime.datetime(2021, 2, 10, 7, 50, 32, 450853), u'10.100.80.10', u'10.55.10.1', u'100.100.12.1', u'PL_LOW'),
       (datetime.datetime(2021, 2, 10, 7, 51, 32, 450853), u'10.10.80.10', u'10.55.10.1', u'100.100.12.1', u'MA_HIGH'),
       (datetime.datetime(2021, 2, 10, 7, 52, 34, 264042), u'10.10.80.10', u'10.55.10.1', u'10.55.10.1', u'PL_DOWN'),
]

This is how it looks when loaded into pandas:

import pandas as pd

df = pd.DataFrame(data)
df.columns = ["date", "start", "end", "end2", "type"]
# drop duplicate rows
df = df.drop_duplicates()

                        date         start           end          end2     type
0 2021-02-10 07:49:07.118658  12.100.90.10  100.100.12.1  100.100.12.1  LT_DOWN
1 2021-02-10 07:49:14.312273  12.100.90.10  100.100.12.1  100.100.12.1    LT_UP
2 2021-02-10 07:49:21.535932  12.100.90.10  100.100.12.1  100.100.22.1    LT_UP
3 2021-02-10 07:50:28.264042  12.100.90.10  100.100.12.1  100.100.32.1  LT_DOWN
4 2021-02-10 07:50:28.725961  12.100.90.10  100.100.12.1  100.100.32.1  PL_DOWN
5 2021-02-10 07:50:32.450853  10.100.80.10    10.55.10.1  100.100.12.1   PL_LOW
6 2021-02-10 07:51:32.450853   10.10.80.10    10.55.10.1  100.100.12.1  MA_HIGH
7 2021-02-10 07:52:34.264042   10.10.80.10    10.55.10.1    10.55.10.1  PL_DOWN

Now I want to select only the rows where the end and end2 columns contain the same value. So my output would be:

                        date         start           end          end2     type
0 2021-02-10 07:49:07.118658  12.100.90.10  100.100.12.1  100.100.12.1  LT_DOWN
1 2021-02-10 07:49:14.312273  12.100.90.10  100.100.12.1  100.100.12.1    LT_UP
2 2021-02-10 07:52:34.264042   10.10.80.10    10.55.10.1    10.55.10.1  PL_DOWN

According to the Stack Overflow question "Get rows that have the same value across its columns in pandas", I could do this to check for identical values across all columns:

df[df.apply(pd.Series.nunique, axis=1) == 1]

But in my case I want this check limited to certain columns only.

How do I do this?

Just use a boolean mask: compare the two columns and index the DataFrame with the result. Either of these works:

df[df.end == df.end2]
df = df.loc[(df['end'] == df['end2'])]
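
For reference, here is a minimal self-contained sketch of the mask, plus one way to extend the same check to more than two columns by applying the nunique idea from the question to a column subset only. The small frame and the `cols` list are illustrative assumptions, not the asker's full data.

```python
import pandas as pd

# Small stand-in frame; values are illustrative, not the asker's full data.
df = pd.DataFrame(
    {
        "start": ["12.100.90.10", "12.100.90.10", "10.10.80.10"],
        "end": ["100.100.12.1", "100.100.12.1", "10.55.10.1"],
        "end2": ["100.100.12.1", "100.100.22.1", "10.55.10.1"],
    }
)

# Two columns: compare them element-wise and use the result as a row mask.
same_two = df[df["end"] == df["end2"]]

# Any number of columns: count distinct values per row over the subset only.
cols = ["end", "end2"]  # hypothetical list; extend with more column names
same_many = df[df[cols].nunique(axis=1) == 1]

print(same_two)
print(same_many)
```

Both selections keep the original row labels; chain `.reset_index(drop=True)` if a fresh 0-based index is wanted. The two approaches can differ when values are missing: `==` never matches NaN, while `nunique` ignores NaN by default, so a row with one value and one NaN would pass the nunique check.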
