Pandas: counting groups

Question

I am trying to get a group of the most popular names by country using pandas. I have gotten halfway, as seen in the snippet, but I am unclear how to convert groupedByCountry into a sorted table.

import math
import pandas
csv = pandas.read_csv("./name_country.csv.gz", compression="gzip")

data = csv[["name", "country"]]

# Keep only the rows that have a country (the original referenced an undefined name, roleIni)
filtered = data[data.country.notnull()]

groupedByCountry = filtered.groupby("country")

Answer 1

You can use groupby size and then use nlargest:

In [10]: import pandas as pd

In [11]: df = pd.DataFrame([["andy", "GB"], ["bob", "US"], ["chris", "GB"]], columns=["name", "country"])

In [12]: df.groupby("country").size().nlargest(1)
Out[12]:
country
GB    2
dtype: int64
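
For the "sorted table" part of the question, one possible approach is to sort the group sizes and turn the result back into a DataFrame. This is only a minimal sketch: the sample rows and the names grouped_by_country, country_counts and the "count" column are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's CSV
df = pd.DataFrame(
    [["andy", "GB"], ["bob", "US"], ["chris", "GB"],
     ["dave", "FR"], ["andy", "GB"], ["bob", "US"]],
    columns=["name", "country"],
)

grouped_by_country = df.groupby("country")

# size() gives one row count per country; sort it in descending order
# and convert the Series into a two-column table
country_counts = (
    grouped_by_country.size()
    .sort_values(ascending=False)
    .reset_index(name="count")
)

print(country_counts)
```

Here reset_index(name="count") is what converts the Series of sizes into a regular table with a country column and a count column.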

However, it's probably more efficient to do a direct value_counts on the column and then take the head (head(n) returns the n most common countries):

In [21]: df["country"].value_counts().head(1)
Out[21]:
GB    2
Name: country, dtype: int64
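
The snippets above count rows per country. If the goal is the most popular names within each country, as the question describes, the same counting idea can be applied to (country, name) pairs. The following is a sketch under assumptions, not the answerer's method: the sample rows and the names name_counts and top_names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: one row per (name, country) occurrence
df = pd.DataFrame(
    [["andy", "GB"], ["andy", "GB"], ["chris", "GB"],
     ["bob", "US"], ["bob", "US"], ["dana", "US"], ["bob", "US"]],
    columns=["name", "country"],
)

# Count how often each name occurs within each country
name_counts = df.groupby(["country", "name"]).size().reset_index(name="count")

# Sort so the most frequent names come first inside each country
name_counts = name_counts.sort_values(
    ["country", "count"], ascending=[True, False]
)

# Keep the top row per country; use head(n) for the top n names instead
top_names = name_counts.groupby("country").head(1).reset_index(drop=True)

print(top_names)
```

Ties are not handled specially here; if two names are equally common in a country, head(1) simply keeps whichever one sorts first.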

Answer 1
0 Accepted 2015-11-06 18:23:39
