I have data that looks like this:
data = [(datetime.datetime(2021, 2, 10, 7, 49, 7, 118658), u'12.100.90.10', u'100.100.12.1', u'100.100.12.1', u'LT_DOWN'),
(datetime.datetime(2021, 2, 10, 7, 49, 14, 312273), u'12.100.90.10', u'100.100.12.1', u'100.100.12.1', u'LT_UP'),
(datetime.datetime(2021, 2, 10, 7, 49, 21, 535932), u'12.100.90.10', u'100.100.12.1', u'100.100.22.1', u'LT_UP'),
(datetime.datetime(2021, 2, 10, 7, 50, 28, 264042), u'12.100.90.10', u'100.100.12.1', u'100.100.32.1', u'LT_DOWN'),
(datetime.datetime(2021, 2, 10, 7, 50, 28, 725961), u'12.100.90.10', u'100.100.12.1', u'100.100.32.1', u'PL_DOWN'),
(datetime.datetime(2021, 2, 10, 7, 50, 32, 450853), u'10.100.80.10', u'10.55.10.1', u'100.100.12.1', u'PL_LOW'),
(datetime.datetime(2021, 2, 10, 7, 51, 32, 450853), u'10.10.80.10', u'10.55.10.1', u'100.100.12.1', u'MA_HIGH'),
(datetime.datetime(2021, 2, 10, 7, 52, 34, 264042), u'10.10.80.10', u'10.55.10.1', u'10.55.10.1', u'PL_DOWN'),
]
This is how it looks after loading it into pandas:
df = pd.DataFrame(data)
df.columns = ["date", "start", "end", "end2", "type"]
# drop duplicate rows
df = df.drop_duplicates()
date start end end2 type
0 2021-02-10 07:49:07.118658 12.100.90.10 100.100.12.1 100.100.12.1 LT_DOWN
1 2021-02-10 07:49:14.312273 12.100.90.10 100.100.12.1 100.100.12.1 LT_UP
2 2021-02-10 07:49:21.535932 12.100.90.10 100.100.12.1 100.100.22.1 LT_UP
3 2021-02-10 07:50:28.264042 12.100.90.10 100.100.12.1 100.100.32.1 LT_DOWN
4 2021-02-10 07:50:28.725961 12.100.90.10 100.100.12.1 100.100.32.1 PL_DOWN
5 2021-02-10 07:50:32.450853 10.100.80.10 10.55.10.1 100.100.12.1 PL_LOW
6 2021-02-10 07:51:32.450853 10.10.80.10 10.55.10.1 100.100.12.1 MA_HIGH
7 2021-02-10 07:52:34.264042 10.10.80.10 10.55.10.1 10.55.10.1 PL_DOWN
Now I only want to select rows where the end and end2 columns contain the same value, so my output would be:
date start end end2 type
0 2021-02-10 07:49:07.118658 12.100.90.10 100.100.12.1 100.100.12.1 LT_DOWN
1 2021-02-10 07:49:14.312273 12.100.90.10 100.100.12.1 100.100.12.1 LT_UP
7 2021-02-10 07:52:34.264042 10.10.80.10 10.55.10.1 10.55.10.1 PL_DOWN
Now, according to this Stack Overflow question, "Get rows that have the same value across its columns in pandas", I could do the following to check for the same value across all columns:
df[df.apply(pd.Series.nunique, axis=1) == 1]
But in my case I want to limit this check to certain columns only. How do I do this?
Just use boolean masking:
df[df['end'] == df['end2']]
Or, equivalently, with .loc:
df = df.loc[df['end'] == df['end2']]
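To make this concrete, here is a small self-contained sketch using a few of the rows from the question (in Python 3 the u'' prefixes are unnecessary). It also shows one way to generalise the check to an arbitrary list of columns by reusing the nunique idea from the linked question, applied to a column subset rather than the whole frame:

```python
import datetime
import pandas as pd

# A few sample rows from the question
data = [
    (datetime.datetime(2021, 2, 10, 7, 49, 7, 118658), '12.100.90.10', '100.100.12.1', '100.100.12.1', 'LT_DOWN'),
    (datetime.datetime(2021, 2, 10, 7, 49, 14, 312273), '12.100.90.10', '100.100.12.1', '100.100.12.1', 'LT_UP'),
    (datetime.datetime(2021, 2, 10, 7, 50, 32, 450853), '10.100.80.10', '10.55.10.1', '100.100.12.1', 'PL_LOW'),
    (datetime.datetime(2021, 2, 10, 7, 52, 34, 264042), '10.10.80.10', '10.55.10.1', '10.55.10.1', 'PL_DOWN'),
]
df = pd.DataFrame(data, columns=["date", "start", "end", "end2", "type"])

# Keep only the rows where 'end' equals 'end2'
matched = df[df['end'] == df['end2']]
print(matched)

# Generalisation: keep rows whose values are identical across any
# chosen list of columns (exactly one unique value per row)
cols = ['end', 'end2']
matched2 = df[df[cols].nunique(axis=1) == 1]
```

Note that boolean masking preserves the original index (here rows 0, 1, and 3); call .reset_index(drop=True) on the result if you want a fresh 0-based index.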