Ordering a column in a pandas DataFrame

Question

I need to create a dataframe filtering out the five most frequently listed countries in the Nationality column and the total number of times they are listed. I've been trying to use groupby, but have been unsuccessful. The code I've used is:

df.groupby(['Nationality']).sum()

I also need to determine what percent of those listed as participating in the program have at least one referral. I'm not sure of the code for this either.

Answer 1

Check this question and its answers; it is similar to what you are asking.

Answer 2

Filter out rows whose Nationality is among the top 5 nationalities (value_counts() sorts by frequency in descending order, so the first five index entries are the five most common values):

df[~df['Nationality'].isin(df['Nationality'].value_counts().index[:5])]
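The question's wording is ambiguous about whether the top-5 rows should be dropped or kept, so here is a minimal sketch showing both from one boolean mask. The sample data and the variable names (`top5_names`, `in_top5`, `rows_in_top5`, `rows_outside_top5`) are made up for illustration; only the Nationality column name comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({
    "Nationality": ["DE"] * 6 + ["US"] * 5 + ["UK"] * 4 + ["ES"] * 3 + ["FR"] * 2 + ["IT"],
})

# value_counts() sorts by frequency (descending), so the first five
# index entries are the five most common nationalities.
top5_names = df["Nationality"].value_counts().index[:5]

# Boolean mask: True where the row's nationality is one of the top five.
in_top5 = df["Nationality"].isin(top5_names)

rows_in_top5 = df[in_top5]        # keep only the top-5 nationalities
rows_outside_top5 = df[~in_top5]  # drop the top-5 nationalities
```

Building the mask once and negating it with `~` avoids recomputing value_counts() for each variant.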

See how many times they're listed in total by looking at the shape of df restricted to rows whose Nationality is in the top 5 (the first element of the shape is the row count):

df[df['Nationality'].isin(df['Nationality'].value_counts().index[:5])].shape
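If what is wanted is a small dataframe listing each of the five countries next to its own count, rather than one combined row total, value_counts() can produce it directly. This is a sketch under assumptions: the sample data is invented, and `top5` and the output column name `Count` are illustrative choices.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({
    "Nationality": ["DE"] * 6 + ["US"] * 5 + ["UK"] * 4 + ["ES"] * 3 + ["FR"] * 2 + ["IT"],
})

# Count each nationality, keep the five most frequent, and turn the
# resulting Series into a two-column dataframe.
top5 = (
    df["Nationality"]
    .value_counts()               # counts per value, sorted descending
    .head(5)                      # five most frequent
    .rename_axis("Nationality")   # name the index so reset_index labels it
    .reset_index(name="Count")    # index -> column, counts -> "Count"
)
```

Note that when several countries share the same count around the fifth position, which of them ends up in the result depends on how value_counts() orders ties.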

Quick way to see what percent of Number_of_Referalls has a value >= 1:

(df['Number_of_Referalls '] >= 1).value_counts(normalize=True) * 100
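To get a single number instead of a True/False breakdown, one option is to take the mean of the boolean comparison, since True counts as 1 and False as 0. The sketch below uses invented data and assumes the column is named `Number_of_Referalls` with no trailing space; the line above includes a trailing space inside the quotes, so check `df.columns` to see which spelling your data actually uses.

```python
import pandas as pd

# Hypothetical sample data; the column name is assumed to have no trailing space.
df = pd.DataFrame({"Number_of_Referalls": [0, 2, 1, 0, 3, 0, 1, 0]})

# True where a participant has at least one referral.
has_referral = df["Number_of_Referalls"] >= 1

# The mean of a boolean Series is the fraction of True values.
percent_with_referral = has_referral.mean() * 100
```

Missing values compare as False here, so rows with no recorded referral count are treated as having zero referrals.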
