From a dataframe, how to groupby a column and append the result to create a new dataframe?
Could you please take a look at my code?
This is (part of) my DF:
display(count_df)
title
0 Programmer
1 Oracle Fusion/ EBS Developer
2 Software Engineer RH-22-07
3 Web Developer E-Commerce
4 Software Engineer
5 Junior Front-End Developer/Designer
6 Programmer Analyst (Giampaolo Group - Hybrid W...
7 Full Stack Developer
8 SAS/SQL Programmer
9 AWS Cloud Architect
10 Full Stack Software Engineer
11 Full Stack .Net Developer (Independent Contrac...
This is the result I get:
#Create a new column called 'counts', and then group by the times the title appears within the column
count_df['counts'] = count_df.groupby('title')['title'].transform('count')
print(count_df.value_counts())
title counts
AWS Cloud Architect 4 4
Programmer 4 4
Web Developer E-Commerce 4 4
Software Engineer 4 4
SAS/SQL Programmer 4 4
Full Stack .Net Developer (Independent Contractor) 4 4
Web developer 4 4
Junior Front-End Developer/Designer 4 4
Intermediate Software Developer 4 4
Full Stack Software Engineer 4 4
Full Stack Developer 4 4
Junior Java Developer (Recent Graduate) 3 3
Software Developer 3 3
Software Developer, Co-Op (Sept-Dec) 3 3
Software Engineer with Test 3 3
Oracle Fusion/ EBS Developer 1 1
React Developer (3 Month Contract) 1 1
Software Engineer RH-22-07 1 1
Programmer Analyst (Giampaolo Group - Hybrid Work From Home) 1 1
In the end, this is the result I need, but as a new DF:
title counts
AWS Cloud Architect 4
Programmer 4
Web Developer E-Commerce 4
Software Engineer 4
SAS/SQL Programmer 4
Full Stack .Net Developer (Independent Contractor) 4
Web developer 4
Junior Front-End Developer/Designer 4
Intermediate Software Developer 4
Full Stack Software Engineer 4
Full Stack Developer 4
Junior Java Developer (Recent Graduate) 3
Software Developer 3
Software Developer, Co-Op (Sept-Dec) 3
Software Engineer with Test 3
Oracle Fusion/ EBS Developer 1
React Developer (3 Month Contract) 1
Software Engineer RH-22-07 1
Programmer Analyst (Giampaolo Group - Hybrid Work From Home) 1
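I think the part of the code that duplicates the 'counts' column is print(count_df.value_counts()), but without it I don't get the expected result.
Thank you.
I think the columns are duplicated because you are assigning the result to a new column of the existing DataFrame rather than to a new DataFrame.
Try this: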
count_df2 = count_df.groupby('title')['title'].size().sort_values(ascending=False).reset_index(name='counts')
Or this:
count_df2 = count_df.groupby('title')['title'].agg('count').sort_values(ascending=False).reset_index(name='counts')
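For context on the duplicated column: calling value_counts() on the whole DataFrame counts each unique (title, counts) row, and since every title already carries its own frequency in 'counts', that row count just repeats the same number. Below is a minimal, self-contained sketch of the approach above. The sample titles are made up for illustration and are not the asker's real data, and count_df3 is a hypothetical name for an alternative built on Series.value_counts() that should give the same table.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's count_df
count_df = pd.DataFrame({
    "title": [
        "Programmer", "Programmer", "Programmer",
        "Software Engineer", "Software Engineer",
        "AWS Cloud Architect",
    ]
})

# One row per distinct title, with its frequency in a new 'counts' column.
# size() returns a Series indexed by title; reset_index(name=...) turns it
# into a two-column DataFrame.
count_df2 = (
    count_df.groupby("title")["title"]
    .size()
    .sort_values(ascending=False)
    .reset_index(name="counts")
)
print(count_df2)

# Alternative sketch: Series.value_counts() already sorts by frequency
# (descending). rename_axis keeps the column name stable across pandas versions.
count_df3 = (
    count_df["title"]
    .value_counts()
    .rename_axis("title")
    .reset_index(name="counts")
)
print(count_df3)
```

Neither variant modifies count_df, so the original DataFrame stays as it was and the aggregated table lives in its own variable.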