简体   繁体   中英

Get only data that are repeated any the given year two or more times in pandas

Below is the Raw Data.

Event   Year    Month
Event1  2011    January
Event1  2012    January
Event1  2013    February
Event1  2014    January
Event1  2015    March
Event2  2011    January
Event2  2014    April
Event3  2012    January
Event3  2015    March
Event4  2013    February

So only get those Event data that are occurred two or more times in given list of year ie [2011,2012,2013,2014].

So Output should be.

    Event   Year    Month
Event1  2011    January
Event1  2012    January
Event1  2013    February
Event1  2014    January
Event1  2015    March
Event2  2011    January
Event2  2014    April

First filter rows by list in Series.isin with boolean indexing and then are filtered duplicated events by DataFrame.duplicated , last filter original column Event :

L = [2011,2012,2013,2014]
df1 = df.loc[df['Year'].isin(L)]

df = df[df['Event'].isin(df1.loc[df1.duplicated(['Event']),'Event'])]

print (df)
    Event  Year     Month
0  Event1  2011   January
1  Event1  2012   January
2  Event1  2013  February
3  Event1  2014   January
4  Event1  2015     March
5  Event2  2011   January
6  Event2  2014     April

Or you can test if greater or equal 2 is number of filtered events :

L = [2011,2012,2013,2014]
df1 = df.loc[df['Year'].isin(L)]

s = df1['Event'].value_counts()
df = df[df['Event'].isin(s.index[s.ge(2)])]

print (df)
    Event  Year     Month
0  Event1  2011   January
1  Event1  2012   January
2  Event1  2013  February
3  Event1  2014   January
4  Event1  2015     March
5  Event2  2011   January
6  Event2  2014     April

Use isin to filter years in the list. Groupby count and filter those greater than or equals to 2

s=df[df['Year'].astype(str).isin(lst)]
s[s.groupby('Event')['Month'].transform('count').ge(2)]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM