Get only the events that occur two or more times within a given list of years in pandas

Question

Below is the raw data.

Event   Year    Month
Event1  2011    January
Event1  2012    January
Event1  2013    February
Event1  2014    January
Event1  2015    March
Event2  2011    January
Event2  2014    April
Event3  2012    January
Event3  2015    March
Event4  2013    February
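The question does not show how the data is loaded. As a minimal sketch, assuming the table lives in a DataFrame named `df` with an integer `Year` column (the name `df` matches the answers; the construction itself is an assumption), it could be rebuilt like this:

```python
import pandas as pd

# Rebuild the raw data from the question as a DataFrame named `df`.
# The integer dtype of "Year" is an assumption; the question does not state it.
df = pd.DataFrame({
    "Event": ["Event1"] * 5 + ["Event2"] * 2 + ["Event3"] * 2 + ["Event4"],
    "Year": [2011, 2012, 2013, 2014, 2015, 2011, 2014, 2012, 2015, 2013],
    "Month": ["January", "January", "February", "January", "March",
              "January", "April", "January", "March", "February"],
})
print(df)
```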

I want only the events that occur two or more times within a given list of years, e.g. [2011,2012,2013,2014].

So the output should be:

Event   Year    Month
Event1  2011    January
Event1  2012    January
Event1  2013    February
Event1  2014    January
Event1  2015    March
Event2  2011    January
Event2  2014    April

Answer 1

First filter the rows by the list of years with Series.isin and boolean indexing, then find the events that are repeated among them with DataFrame.duplicated, and finally filter the original Event column by those events:

L = [2011,2012,2013,2014]
df1 = df.loc[df['Year'].isin(L)]

df = df[df['Event'].isin(df1.loc[df1.duplicated(['Event']),'Event'])]

print (df)
    Event  Year     Month
0  Event1  2011   January
1  Event1  2012   January
2  Event1  2013  February
3  Event1  2014   January
4  Event1  2015     March
5  Event2  2011   January
6  Event2  2014     April
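To make the intermediate steps of this approach visible, here is a self-contained sketch. The DataFrame construction and the names `dup_mask`, `repeated` and `result` are assumptions added for illustration; the filtering calls are the same ones used above.

```python
import pandas as pd

# Sample data from the question (construction is an assumption).
df = pd.DataFrame({
    "Event": ["Event1"] * 5 + ["Event2"] * 2 + ["Event3"] * 2 + ["Event4"],
    "Year": [2011, 2012, 2013, 2014, 2015, 2011, 2014, 2012, 2015, 2013],
    "Month": ["January", "January", "February", "January", "March",
              "January", "April", "January", "March", "February"],
})

L = [2011, 2012, 2013, 2014]
df1 = df.loc[df["Year"].isin(L)]

# True for every occurrence of an Event after its first one inside the year list.
dup_mask = df1.duplicated(["Event"])

# Events that appear at least twice inside the year list.
repeated = df1.loc[dup_mask, "Event"].unique()
print(repeated)

# Keep every row of those events from the original frame, including other years.
result = df[df["Event"].isin(repeated)]
print(result)
```

Because the last step filters the original `df` rather than `df1`, the Event1 row for 2015 is kept even though 2015 is not in the list.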

Alternatively, count the filtered events with Series.value_counts and keep those whose count is greater than or equal to 2:

L = [2011,2012,2013,2014]
df1 = df.loc[df['Year'].isin(L)]

s = df1['Event'].value_counts()
df = df[df['Event'].isin(s.index[s.ge(2)])]

print (df)
    Event  Year     Month
0  Event1  2011   January
1  Event1  2012   January
2  Event1  2013  February
3  Event1  2014   January
4  Event1  2015     March
5  Event2  2011   January
6  Event2  2014     April
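If the threshold or the year list needs to vary, the counting approach can be wrapped in a small function. This is only a sketch: the function name `events_repeated_in_years` and its `min_count` parameter are hypothetical, not part of pandas or of the original answer.

```python
import pandas as pd

def events_repeated_in_years(df, years, min_count=2):
    """Return all rows of events seen at least `min_count` times in `years`."""
    in_years = df.loc[df["Year"].isin(years)]
    counts = in_years["Event"].value_counts()   # occurrences per event
    keep = counts.index[counts.ge(min_count)]   # events meeting the threshold
    return df[df["Event"].isin(keep)]

# Sample data from the question (construction is an assumption).
df = pd.DataFrame({
    "Event": ["Event1"] * 5 + ["Event2"] * 2 + ["Event3"] * 2 + ["Event4"],
    "Year": [2011, 2012, 2013, 2014, 2015, 2011, 2014, 2012, 2015, 2013],
    "Month": ["January", "January", "February", "January", "March",
              "January", "April", "January", "March", "February"],
})

out = events_repeated_in_years(df, [2011, 2012, 2013, 2014])
print(out)
```

The function returns a new frame instead of reassigning `df`, so the original data stays available for further filtering.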

Answer 2

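Use isin to keep the rows whose year is in the list, then group by Event, count, and keep the groups with a count greater than or equal to 2. Note that this returns only the rows inside the listed years, so the Event1 row for 2015 is not included.

# Year is cast to str here, so the list must hold strings
lst = ['2011', '2012', '2013', '2014']
s = df[df['Year'].astype(str).isin(lst)]
s[s.groupby('Event')['Month'].transform('count').ge(2)]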
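To keep every row of the qualifying events, as in the expected output of the question, one option is to apply the groupby/transform idea to the isin mask itself instead of to the pre-filtered frame. This is a variation on the answer, not the answer's own code; the DataFrame construction and the names `in_list`, `hits` and `result` are assumptions.

```python
import pandas as pd

# Sample data from the question (construction is an assumption).
df = pd.DataFrame({
    "Event": ["Event1"] * 5 + ["Event2"] * 2 + ["Event3"] * 2 + ["Event4"],
    "Year": [2011, 2012, 2013, 2014, 2015, 2011, 2014, 2012, 2015, 2013],
    "Month": ["January", "January", "February", "January", "March",
              "January", "April", "January", "March", "February"],
})

L = [2011, 2012, 2013, 2014]

# 1 where the row's year is in the list, 0 otherwise.
in_list = df["Year"].isin(L).astype(int)

# Per-event count of in-list rows, broadcast back to every row of that event.
hits = in_list.groupby(df["Event"]).transform("sum")

# Keep all rows of events with at least two in-list occurrences.
result = df[hits.ge(2)]
print(result)
```

This does the whole filter with one boolean mask and no intermediate frame, at the cost of being a little less explicit than the two-step versions.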

2 answers

Answer 1
1 2022-06-10 11:00:27

Answer 2
0 2022-06-10 10:59:41
