[英]Pandas Dataframe groupby column
I have a dataframe df which i need to groupby Department Name column我有一个 dataframe df 我需要按部门名称列分组
Input输入
Employee Name![]() |
Department Name![]() |
Subjects![]() |
Billable![]() |
Hours![]() |
Date![]() |
---|---|---|---|---|---|
Anu![]() |
CS ![]() |
Java ![]() |
Yes![]() |
8 ![]() |
01-03-2021 ![]() |
Anu![]() |
CS ![]() |
Python ![]() |
Yes![]() |
9 ![]() |
02-03-2021 ![]() |
Anu![]() |
CS ![]() |
SQL ![]() |
No![]() |
6 ![]() |
03-03-2021 ![]() |
Anu![]() |
CS ![]() |
React![]() |
Yes![]() |
5 ![]() |
03-03-2021 ![]() |
Anu![]() |
CS ![]() |
.Net ![]() |
No![]() |
8 ![]() |
04-03-2021 ![]() |
Bala![]() |
CS ![]() |
SQL ![]() |
No![]() |
5 ![]() |
01-03-2021 ![]() |
Bala![]() |
CS ![]() |
Python ![]() |
Yes![]() |
4 ![]() |
01-03-2021 ![]() |
Bala![]() |
CS ![]() |
Java ![]() |
Yes![]() |
2 ![]() |
02-03-2021 ![]() |
Bala![]() |
CS ![]() |
.Net ![]() |
No![]() |
8 ![]() |
03-03-2021 ![]() |
Bala![]() |
CS ![]() |
React![]() |
Yes![]() |
7 ![]() |
04-03-2021 ![]() |
Code代码
df = pd.pivot_table(df,index=['Department Name','Employee Name','Billable'],columns=['Subjects'],values='Hours',aggfunc={'Hours': np.sum})
# Resetting index
df = df.reset_index ()
list_column = df.columns
# Adding new columns and calculation
total = df.sum(axis=1)
df.insert(len(df.columns), column='Total', value=total)
available_col = len(df.columns)
Utilization_col = len(df.columns)
utilization_row = len(df.columns)
# Adding Available column
available = 168
df.insert(len(df.columns), column='Available', value=available)
# Adding Utilization column
utilization = (total / available)
df.insert(len(df.columns), column='Utilization', value=utilization)
# Filter dataframe using groupby
df1 = df.groupby(['Department Name','Employee Name'], sort=False ).sum(min_count=1)
df1['Available'] = available
# Adding Billable Utilization column and Non-billable Utilization column
df['Billable'] = np.where(df['Billable'] == 'Billable', 'Billable Utilization','Non Billable Utilization')
df2 = (df.groupby(['Employee Name', 'Billable Status'])[list_column].sum().sum(axis=1).unstack().div(available).mul(100)).round(2)
df = df1.join(df2).reset_index()
df.index = df.index
# Round the column value
df['Total'] = df['Total'].round(2)
df = df.groupby(['Department Name','Employee Name'], as_index=False).sum(min_count=1)
Expected Output预计 Output
Note :注意:
I tried to use reset_index, but groupby function not works.我尝试使用reset_index,但groupby function 不起作用。
I have tried making the following function and I was able to get your desired output我已经尝试制作以下 function 并且我能够获得您想要的 output
def func(x):
for i in range(1, x['Department Name'].size):
x['Department Name'].iloc[i] = ''
return x;
df['Department Name'] = df['Department Name'].apply(str)
df = df.groupby('Department
Name').apply(func).set_index('Department Name')
df.head()
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.