
Ordering a column in pandas dataframe

I need to create a dataframe filtering out the five most frequently listed countries in the Nationality column and the total number of times they are listed. I've been trying to use groupby, but have been unsuccessful. The code I've used is:

df.groupby(['Nationality']).sum() 

I also need to determine what percentage of those listed as participating in the program have at least one referral. I'm not sure what code to use for this either.

This is part of the dataframe.

Check this question and its answers; it is similar to what you are asking.

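Filter out the rows whose Nationality is among the five most frequent nationalities (value_counts() sorts by frequency in descending order, so the first five index entries are the top 5):

df[~df['Nationality'].isin(df['Nationality'].value_counts().index[:5])]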

See how many times they're listed in total by looking at the shape of df restricted to rows whose Nationality is in the top 5 (the first element of the tuple is the row count):

df[df['Nationality'].isin(df['Nationality'].value_counts().index[:5])].shape
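If what you want is one row per country with its own count, rather than a single total, value_counts() alone may already be enough. The following is a minimal sketch on made-up data; the sample values and the names top5, top5_df and Count are assumptions for illustration, not taken from the question.

```python
import pandas as pd

# Hypothetical sample data; the real dataframe is not shown in the post.
df = pd.DataFrame({
    "Nationality": ["US", "US", "US", "MX", "MX", "CA", "FR", "DE", "DE", "BR", "JP"],
})

# value_counts() sorts by frequency (descending), so head(5) keeps
# the five most frequent nationalities together with their counts.
top5 = df["Nationality"].value_counts().head(5)

# Turn the Series into a two-column dataframe.
# "Count" is a column name chosen here for illustration.
top5_df = top5.rename_axis("Nationality").reset_index(name="Count")
print(top5_df)
```

This also shows why df.groupby(['Nationality']).sum() does not help here: sum() adds up the numeric columns within each group instead of counting rows.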

A quick way to see what percentage of rows have a Number_of_Referalls value greater than or equal to 1 (read the True row of the result):

(df['Number_of_Referalls '] >= 1).value_counts(normalize=True) * 100
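To get that percentage as a single number, you can take the mean of the boolean mask, since True counts as 1 and False as 0. This is a sketch on made-up data; the sample values and the variable names are assumptions, and the column name (including its trailing space) is copied from the line above. Restricting the calculation to program participants would need an extra filter on a column that is not shown in the question.

```python
import pandas as pd

# Hypothetical sample data; the column name, with its trailing space,
# is copied as written in the answer above.
df = pd.DataFrame({"Number_of_Referalls ": [0, 2, 1, 0, 3]})

# Boolean mask: True where the row has at least one referral.
has_referral = df["Number_of_Referalls "] >= 1

# The mean of a boolean Series is the fraction of True values.
pct_with_referral = has_referral.mean() * 100
print(f"{pct_with_referral:.1f}%")
```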
