Pandas: groupby by date and transform nunique returning too many entries

I am trying to do a simple group-by in pandas, but it returns one entry per row instead of one entry per date:

import pandas as pd

url='https://raw.githubusercontent.com/108michael/ms_thesis/master/raw_bills'

bills=pd.read_csv(url)
bills.date.nunique()
11 
bills.dtypes

date         float64
bills         object
id.thomas      int64
dtype: object 

bills[['date', 'bills']].groupby(['date']).bills.transform('nunique')

0       3627
1       7454
2       7454
3       7454
4       3627
5       7454
6       7454
7       3627
8       7454
9       7454
10      3627
11      7454
12      7454
13      7454
14      7454
15      7454
16      3627
17      3627
18      7454

I've done this sort of group-by before, and it usually works fine.

Any suggestions on this?

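I'm not sure exactly what you're asking, but transform returns one value for every row of the original frame, which is why you get so many entries. If you want one count per date, don't you want to use nunique directly: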

bills[['date', 'bills']].groupby('date').bills.nunique()

date
2005.0    6820
2006.0    3738
2007.0    7454
2008.0    3627
2009.0    7324
2010.0    3297
2011.0    5787
2012.0    4647
2013.0    5694
2014.0    3211
2015.0       5
Name: bills, dtype: int64
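
To make the difference concrete, here is a minimal sketch on a small made-up frame. The `toy` data and the names `per_group` and `per_row` are assumptions for illustration only, not the real bills file. It shows that nunique() collapses to one row per date, while transform('nunique') repeats each group's count on every original row:

```python
import pandas as pd

# Hypothetical toy data standing in for the bills CSV (not the real file).
toy = pd.DataFrame({
    "date": [2005.0, 2005.0, 2005.0, 2006.0, 2006.0],
    "bills": ["hr1", "hr2", "hr1", "s1", "s1"],
})

# Aggregation: one result per group, indexed by date.
per_group = toy.groupby("date")["bills"].nunique()

# Transform: each group's count is broadcast back onto every original row,
# so the result has the same length and index as the input frame.
per_row = toy.groupby("date")["bills"].transform("nunique")

print(per_group)
print(per_row)
```

The transform form is mainly useful when you want to attach the per-date count as a new column next to the original rows. For a summary with one line per date, the plain nunique() aggregation is probably what you want.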
