yearCount = df[['antibiotic', 'order_date', 'antiYearCount']]
yearGroups = yearCount.groupby('order_date')
for year in yearGroups:
    yearCount['antiYearCount'] = year.groupby('antibiotic')['antibiotic'].transform(pd.Series.value_counts)
In this case, yearCount is a dataframe containing 'order_date', 'antibiotic', 'antiYearCount'. I have cleaned 'order_date' to only contain the year of the order. I want to group yearCount by the years in 'order_date', count the number of times each 'antibiotic' appears in each "year group", then assign that value to yearCount's 'antiYearCount' variable.
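One detail about the loop above may help: iterating over a pandas groupby object yields (key, sub-frame) pairs rather than bare frames, so the loop variable has to be unpacked before the group can be used. The following is only a minimal sketch on made-up data (the sample values and the per_year dictionary are assumptions for illustration, not part of the original question):

```python
import pandas as pd

# Hypothetical sample data, with order_date already reduced to the year.
df = pd.DataFrame({
    'antibiotic': ['a', 'c', 'c', 'b', 'b', 'b'],
    'order_date': [2012, 2012, 2012, 2013, 2013, 2013],
})

# Each iteration gives the group key and the rows belonging to that key.
per_year = {}
for year, group in df.groupby('order_date'):
    # Count how often each antibiotic occurs within this year only.
    per_year[year] = group['antibiotic'].value_counts().to_dict()

print(per_year)
```

This collects the counts per year, but it does not write them back row by row; assigning to the whole column inside the loop would replace the previous iteration's values each time.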
I think you need to add the order_date column to the groupby. It is then also possible to use size instead of pd.Series.value_counts for the same output:
import pandas as pd

df = pd.DataFrame({'antibiotic':list('accbbb'),
                   'antiYearCount':[4,5,4,5,5,4],
                   'C':[7,8,9,4,2,3],
                   'D':[1,3,5,7,1,0],
                   'E':[5,3,6,9,2,4],
                   'order_date': pd.to_datetime(['2012-01-01']*3+['2012-01-02']*3)})
print (df)
   C  D  E  antiYearCount antibiotic order_date
0  7  1  5              4          a 2012-01-01
1  8  3  3              5          c 2012-01-01
2  9  5  6              4          c 2012-01-01
3  4  7  9              5          b 2012-01-02
4  2  1  2              5          b 2012-01-02
5  3  0  4              4          b 2012-01-02
# copy the selection to avoid the SettingWithCopyWarning
#https://stackoverflow.com/a/45035966/2901002
yearCount = df[['antibiotic', 'order_date', 'antiYearCount']].copy()
yearCount['antiYearCount'] = yearCount.groupby(['order_date','antibiotic'])['antibiotic'] \
                                     .transform('size')
print (yearCount)
  antibiotic order_date  antiYearCount
0          a 2012-01-01              1
1          c 2012-01-01              2
2          c 2012-01-01              2
3          b 2012-01-02              3
4          b 2012-01-02              3
5          b 2012-01-02              3
yearCount['antiYearCount'] = yearCount.groupby(['order_date','antibiotic'])['antibiotic'] \
                                     .transform(pd.Series.value_counts)
print (yearCount)
  antibiotic order_date  antiYearCount
0          a 2012-01-01              1
1          c 2012-01-01              2
2          c 2012-01-01              2
3          b 2012-01-02              3
4          b 2012-01-02              3
5          b 2012-01-02              3
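The question mentions that order_date was reduced to the year, while the sample above groups by full dates. If the column still holds full timestamps, one possible approach is to derive the year with .dt.year and group by that instead. This is only a sketch under that assumption; the sample dates and the helper Series named year are made up for illustration:

```python
import pandas as pd

# Hypothetical data: full timestamps spanning two different years.
df = pd.DataFrame({
    'antibiotic': list('accbbb'),
    'order_date': pd.to_datetime(
        ['2012-01-01', '2012-06-15', '2012-12-31',
         '2013-01-02', '2013-03-04', '2013-05-06']),
})

# Derive the year once, then count rows per (year, antibiotic) pair.
year = df['order_date'].dt.year.rename('year')
df['antiYearCount'] = (df.groupby([year, 'antibiotic'])['antibiotic']
                         .transform('size'))

print(df)
```

Because transform returns a result aligned with the original index, every row receives the count of its own (year, antibiotic) group, and the frame keeps its original shape.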