
Ordering a column in pandas dataframe

I need to create a dataframe filtering out the five most frequently listed countries in the Nationality column, along with the total number of times they are listed. I've been trying to use groupby, but have been unsuccessful. The code I've used is:

df.groupby(['Nationality']).sum() 

I also need to determine what percent of those listed as participating in the program have at least one referral. I'm not sure of the code for this either, though.

This is part of the dataframe

Check this question and its answers; it is similar to what you are asking.

Filter out rows whose Nationality is among the top 5 nationalities:

df[~df['Nationality'].isin(df['Nationality'].value_counts().index[:5])]
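As a quick check of this kind of filter, here is a minimal sketch on a made-up dataframe (the sample values are invented for illustration, not taken from the asker's data). It relies on `value_counts()` sorting by frequency in descending order, takes the first five index labels, and negates the `isin` mask with `~`.

```python
import pandas as pd

# Toy data, invented for illustration only
df = pd.DataFrame({
    "Nationality": ["US"] * 6 + ["UK"] * 5 + ["FR"] * 4
                   + ["DE"] * 3 + ["IT"] * 2 + ["ES"],
})

# value_counts() sorts by frequency (descending),
# so the first five index labels are the five most common values
top5 = df["Nationality"].value_counts().index[:5]

# Keep only rows whose Nationality is NOT among the top five
rest = df[~df["Nationality"].isin(top5)]
print(rest)
```

Dropping the `~` keeps the top-five rows instead of excluding them.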

See how many times they're listed in total by looking at the shape of df restricted to rows whose Nationality is in the top 5 (the first element of the tuple is the row count):

df[df['Nationality'].isin(df['Nationality'].value_counts().index[:5])].shape
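If the goal is a dataframe listing the five most common nationalities with a count for each, rather than one combined total, something along these lines may be closer to what the question asks for. This is only a sketch on invented sample data, and the output column name `Count` is my own choice.

```python
import pandas as pd

# Toy data, invented for illustration only
df = pd.DataFrame({
    "Nationality": ["US"] * 6 + ["UK"] * 5 + ["FR"] * 4
                   + ["DE"] * 3 + ["IT"] * 2 + ["ES"],
})

top5_counts = (
    df["Nationality"]
    .value_counts()               # occurrences per country, most frequent first
    .head(5)                      # keep the five most frequent
    .rename_axis("Nationality")   # name the index so it becomes a column
    .reset_index(name="Count")    # turn the Series into a two-column dataframe
)
print(top5_counts)
```

As for the original attempt, `groupby(['Nationality']).sum()` adds up the numeric columns within each group instead of counting rows; `df.groupby('Nationality').size().nlargest(5)` would give the same counts as above.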

Quick way to see what percent of Number_of_Referalls values are greater than or equal to 1:

(df['Number_of_Referalls '] >= 1).value_counts(normalize=True) * 100
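To get a single number instead of a True/False breakdown, the mean of the boolean mask can be used, since `True` counts as 1. A sketch on invented data: it assumes the column is named `Number_of_Referalls` with no trailing space (adjust the name to match your dataframe) and that every row is a program participant.

```python
import pandas as pd

# Toy data, invented for illustration only; the column name is assumed
df = pd.DataFrame({"Number_of_Referalls": [0, 2, 1, 0, 3]})

# Boolean mask: True where the row has at least one referral
has_referral = df["Number_of_Referalls"] >= 1

# The mean of a boolean Series is the fraction of True values
pct_with_referral = has_referral.mean() * 100
print(pct_with_referral)
```

If only some rows are participants, filter the dataframe down to those rows first and then apply the same mask.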
