Hi I have a datframe that look like this
Date Col1 Col2 Col3 col4
0 1-4-2020 a ed kv sc sf ds n
1 1-4-2020 b hed sf ed df hed c
2 1-4-2020 c df xz sf sf ds n
3 2-4-2020 a ed df hed sf ds n
4 2-4-2020 b gv bb sc s ds c
5 2-4-2020 c ed sf sc hed ds n
What I want to do is to filter the rows with multiple conditions. If:
So the result will look like this:
Date Col1 Col2 Col3 col4
1 1-4-2020 b hed sf ed df hed c
3 2-4-2020 a ed df hed sf djs n
5 2-4-2020 c ed sf ssc hed djs n
I hope it is clear Update
In case Col4
Date Col1 Col2 Col3 col4
0 1-4-2020 a ed kv sc sf ds n
1 1-4-2020 b hed sf ed df hed c
2 1-4-2020 c df xz sf sf ds n
3 1-4-2020 d df xdg sf sf hed y
4 2-4-2020 a ed df hed sf ds n
5 2-4-2020 b gv bb sc s ds c
6 2-4-2020 c ed sf sc hed ds n
7 2-4-2020 d df xgfgf gf gfd null
In case that the value in Col4 appears only once in the dataframe like in row 3 (hint no next row to compare with) the function should return the row.
Date Col1 Col2 Col3 col4
1 1-4-2020 b hed sf ed df hed c
3 1-4-2020 d df xdg sf sf hed y
4 2-4-2020 a ed df hed sf ds n
6 2-4-2020 c ed sf sc hed ds n
First get counts of rows by same Col1
and Col4
by GroupBy.transform
with GroupBy.size
and join column Col2
with Col3
toegether:
count = df.groupby(['Col1','col4'])['col4'].transform('size')
s = (df['Col2'] + ' ' + df['Col3'])
You can verify output:
print (df.assign(count=count, both=s))
Date Col1 Col2 Col3 col4 count both
0 1-4-2020 a ed kv sc sf ds n 2 ed kv sc sf ds
1 1-4-2020 b hed sf ed df hed c 2 hed sf ed df hed
2 1-4-2020 c df xz sf sf ds n 2 df xz sf sf ds
3 2-4-2020 a ed df hed sf ds n 2 ed df hed sf ds
4 2-4-2020 b gv bb sc s ds c 2 gv bb sc s ds
5 2-4-2020 c ed gf sc hed ds n 2 ed gf sc hed ds
Last filter by boolean indexing
if counts are 2
by Series.eq
, then Series.str.contains
for test substrings and chained mask together by &
for bitwise AND:
df = df[count.eq(2) & s.str.contains('hed') & s.str.contains('sf')]
print (df)
Date Col1 Col2 Col3 col4
1 1-4-2020 b hed sf ed df hed c
3 2-4-2020 a ed df hed sf ds n
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.