I need to add a column to an existing DataFrame (read from an Excel file) that counts how many times each model has HK == 'HIT' or 'HITTOP'. The DataFrame has two columns (Model, HK), and the HK column contains the strings HIT or HITTOP. Below is my code. I made a counter, but it only counts whether the model has a non-empty value in the HK column. The DataFrame combines models from many files, so it contains duplicates, which is why I need the counter to apply a specific condition.
import pandas as pd
df = pd.read_excel(r'C:\Users\user\Desktop\test\output.xlsx')
df['count'] = df.groupby('Model')['HK'].transform('count')  # add a count column: number of non-missing HK values per Model
df.to_excel(r'C:\Users\user\Desktop\test\output3.xlsx')  # save the output
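As a hedged illustration of why the counter above ignores the HIT/HITTOP condition: transform('count') counts every non-missing entry in a group, whatever its value. The sketch below uses a made-up frame named demo, with a hypothetical 'OTHER' value that does not come from the original data, to show that 'OTHER' and even an empty string are counted, while NaN (which is what pandas produces for a blank Excel cell) is not.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'OTHER' is an invented value used only for illustration.
demo = pd.DataFrame({
    'Model': ['model1', 'model2', 'model1', 'model1', 'model2'],
    'HK': ['HITTOP', 'HIT', 'OTHER', np.nan, ''],
})

# 'count' tallies non-missing entries per group; it does not look at the values.
demo['count'] = demo.groupby('Model')['HK'].transform('count')
print(demo)
```

Here model1 gets 2 (HITTOP and OTHER; the NaN is skipped) and model2 gets 2 (HIT and the empty string), so a value filter is needed before counting.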
Sample data:
d = {'Model': ['model1', 'model2', 'model1', 'model1', 'model2'], 'HK': ['HITTOP', 'HIT', 'HITTOP', '', '']}
df = pd.DataFrame(data=d)
df
    Model      HK
0  model1  HITTOP
1  model2     HIT
2  model1  HITTOP
3  model1
4  model2
Desired output:
f = {'Model': ['model1', 'model2', 'model1', 'model1', 'model2'], 'HK': ['HITTOP', 'HIT', 'HITTOP', '', ''],
     'Count': [2, 1, 2, 2, 1]}
df = pd.DataFrame(data=f)
df
    Model      HK  Count
0  model1  HITTOP      2
1  model2     HIT      1
2  model1  HITTOP      2
3  model1              2
4  model2              1
df = df.fillna('')
df2 = df.groupby('HK').apply(lambda x: x.shape[0]).rename('Count').reset_index()  # number of rows per HK value
df = df.merge(df2, how='left')
    Model      HK  Count
0  model1  HITTOP      2
1  model2     HIT      1
2  model1  HITTOP      2
3  model1              2
4  model2              2
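Note that grouping on HK counts rows per HK value, so the last row gets 2 (the number of empty HK cells) rather than the 1 shown in the desired output. A minimal sketch of an alternative that counts per Model is given below. It assumes the HK values are exactly 'HIT' or 'HITTOP' (no stray whitespace or case differences); the names targets and is_hit are my own and not from the original post.

```python
import pandas as pd

d = {'Model': ['model1', 'model2', 'model1', 'model1', 'model2'],
     'HK': ['HITTOP', 'HIT', 'HITTOP', '', '']}
df = pd.DataFrame(data=d)

# Assumed set of values that should be counted.
targets = ['HIT', 'HITTOP']

# 1 where HK is one of the target strings, 0 otherwise (NaN also maps to 0).
is_hit = df['HK'].isin(targets).astype(int)

# Sum the flags within each Model and broadcast the total back to every row.
df['Count'] = is_hit.groupby(df['Model']).transform('sum')
print(df)
```

Because transform returns a result aligned with the original index, every row of a model carries that model's total, including rows whose own HK cell is empty.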