Get only the events that occur two or more times within a given list of years in pandas

Below is the raw data.

Event   Year    Month
Event1  2011    January
Event1  2012    January
Event1  2013    February
Event1  2014    January
Event1  2015    March
Event2  2011    January
Event2  2014    April
Event3  2012    January
Event3  2015    March
Event4  2013    February
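
To make the answers below reproducible, the table can be rebuilt as a DataFrame. This is only a sketch: it assumes `Year` holds integers and that the frame has the default integer index, neither of which the question states explicitly.

```python
import pandas as pd

# Rebuild the raw data from the question (Year assumed to be integer dtype).
df = pd.DataFrame({
    "Event": ["Event1"] * 5 + ["Event2"] * 2 + ["Event3"] * 2 + ["Event4"],
    "Year": [2011, 2012, 2013, 2014, 2015, 2011, 2014, 2012, 2015, 2013],
    "Month": ["January", "January", "February", "January", "March",
              "January", "April", "January", "March", "February"],
})

print(df)
```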

I only want the events that occur two or more times within a given list of years, i.e. [2011, 2012, 2013, 2014]. All rows of a matching event should be returned, including rows whose year is outside the list.

So the output should be:

Event   Year    Month
Event1  2011    January
Event1  2012    January
Event1  2013    February
Event1  2014    January
Event1  2015    March
Event2  2011    January
Event2  2014    April

First filter the rows by the list of years with Series.isin and boolean indexing, then find the events that are duplicated within that subset with DataFrame.duplicated, and finally filter the original DataFrame on the Event column:

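L = [2011, 2012, 2013, 2014]
# keep only the rows whose Year is in the list
df1 = df.loc[df['Year'].isin(L)]

# events seen more than once in df1, then all of their rows from the original df
df = df[df['Event'].isin(df1.loc[df1.duplicated(['Event']), 'Event'])]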

print (df)
    Event  Year     Month
0  Event1  2011   January
1  Event1  2012   January
2  Event1  2013  February
3  Event1  2014   January
4  Event1  2015     March
5  Event2  2011   January
6  Event2  2014     April
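
The intermediate values may make it easier to see why this works. The sketch below rebuilds the sample data (assuming integer years) and uses the illustrative names `in_years`, `repeated` and `result`, which are not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Event": ["Event1"] * 5 + ["Event2"] * 2 + ["Event3"] * 2 + ["Event4"],
    "Year": [2011, 2012, 2013, 2014, 2015, 2011, 2014, 2012, 2015, 2013],
    "Month": ["January", "January", "February", "January", "March",
              "January", "April", "January", "March", "February"],
})
L = [2011, 2012, 2013, 2014]

# Step 1: rows whose Year is in the list.
in_years = df[df["Year"].isin(L)]

# Step 2: duplicated() flags the second and later occurrences of each Event,
# so an event shows up here only if it has at least two rows in in_years.
repeated = in_years.loc[in_years.duplicated(["Event"]), "Event"]
print(repeated.unique())

# Step 3: take every row of those events from the original frame,
# including rows whose year is outside the list.
result = df[df["Event"].isin(repeated)]
print(result)
```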

Or you can test whether the number of filtered rows per event is greater than or equal to 2:

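L = [2011, 2012, 2013, 2014]
df1 = df.loc[df['Year'].isin(L)]

# number of rows per event within the listed years
s = df1['Event'].value_counts()
# keep every row of the events counted at least twice
df = df[df['Event'].isin(s.index[s.ge(2)])]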

print (df)
    Event  Year     Month
0  Event1  2011   January
1  Event1  2012   January
2  Event1  2013  February
3  Event1  2014   January
4  Event1  2015     March
5  Event2  2011   January
6  Event2  2014     April
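
Because the count is explicit here, the threshold is easy to change. A minimal sketch of the same idea wrapped in a function follows; the name `filter_repeated_events` and the `min_count` parameter are hypothetical and only for illustration.

```python
import pandas as pd

def filter_repeated_events(df, years, min_count=2):
    """Return all rows of events with at least min_count rows in the given years."""
    # Count rows per event, looking only at the listed years.
    counts = df.loc[df["Year"].isin(years), "Event"].value_counts()
    # Events that reach the threshold.
    keep = counts.index[counts.ge(min_count)]
    # Select every row of those events from the original frame.
    return df[df["Event"].isin(keep)]

df = pd.DataFrame({
    "Event": ["Event1"] * 5 + ["Event2"] * 2 + ["Event3"] * 2 + ["Event4"],
    "Year": [2011, 2012, 2013, 2014, 2015, 2011, 2014, 2012, 2015, 2013],
    "Month": ["January", "January", "February", "January", "March",
              "January", "April", "January", "March", "February"],
})

result = filter_repeated_events(df, [2011, 2012, 2013, 2014])
print(result)
```

With the sample data, raising `min_count` to 3 would leave only Event1, since Event2 has just two rows in the listed years.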

Use isin to filter the years in the list. Then group by Event, count the rows per group, and keep the groups whose count is greater than or equal to 2.

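lst = ['2011', '2012', '2013', '2014']  # assumed: years as strings, to match astype(str)
s = df[df['Year'].astype(str).isin(lst)]
# note: the result is taken from s, so rows outside lst (e.g. Event1 2015) are not returned
s[s.groupby('Event')['Month'].transform('count').ge(2)]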
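
If every row of a qualifying event is wanted, as in the expected output, one option is to count on a mask aligned with the original frame instead of on the filtered subset. This is a sketch of that variant, not the answerer's code; it assumes integer years, and the names `in_list` and `per_event` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Event": ["Event1"] * 5 + ["Event2"] * 2 + ["Event3"] * 2 + ["Event4"],
    "Year": [2011, 2012, 2013, 2014, 2015, 2011, 2014, 2012, 2015, 2013],
    "Month": ["January", "January", "February", "January", "March",
              "January", "April", "January", "March", "February"],
})
L = [2011, 2012, 2013, 2014]

# 1 where the row's Year is in the list, 0 otherwise (same index as df).
in_list = df["Year"].isin(L).astype(int)

# For each row, the number of in-list rows that its event has.
per_event = in_list.groupby(df["Event"]).transform("sum")

# Keep all rows of events with at least two in-list rows.
result = df[per_event.ge(2)]
print(result)
```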
