pandas operations inside a for-loop
Here is a sample of my data:
threats =
                                             binomial_name
continent      threat_type
Africa         Agriculture & Aquaculture               143
               Biological Resource Use                 102
               Climate Change                            3
               Commercial Development                   36
               Energy Production & Mining               30
...            ...                                     ...
South America  Human Intrusions                          1
               Invasive Species                          3
               Natural System Modifications              1
               Transportation Corridor                   2
               Unknown                                  38
I want to use a for loop to get the top 5 values for each continent and append them together into a single data frame. Here is my code:
continents = threats.continent.unique()
for i in continents:
    continen = (threats
                .query('continent == i')
                .groupby(['continent','threat_type'])
                .sort_values(by=('binomial_name'), ascending=False).
                .head())
    top5 = appended_data.append(continen)
However, I am getting this error:
KeyError: 'i'
Where am I going wrong?
So, the canonical way to do this:
# the counts live in the 'binomial_name' column of threats
threats.groupby('continent', as_index=False).apply(
    lambda grp: grp.nlargest(5, 'binomial_name'))
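A minimal, self-contained sketch of the same idea without `apply`: sort once by the count, then keep the first 5 rows of each continent with `groupby(...).head(5)`. This is a swapped-in alternative, not the snippet above, and it can pick different rows when there are ties at the cutoff. The data is made up for illustration, and it assumes `continent` and `threat_type` are ordinary columns rather than index levels.

```python
import pandas as pd

# Hypothetical data; the threat names and counts are invented for illustration.
threats = pd.DataFrame({
    "continent": ["Africa"] * 6 + ["Asia"] * 6,
    "threat_type": list("ABCDEF") * 2,
    "binomial_name": [143, 102, 3, 36, 30, 7, 5, 80, 12, 1, 44, 9],
})

top5 = (
    threats
    # Largest counts first, across the whole frame.
    .sort_values("binomial_name", ascending=False)
    # head(5) on a groupby keeps the first 5 rows of every group.
    .groupby("continent")
    .head(5)
    # Tidy up: group the continents together again, largest count first.
    .sort_values(["continent", "binomial_name"], ascending=[True, False])
    .reset_index(drop=True)
)

print(top5)
```

If `continent` is an index level in your real data, calling `threats.reset_index()` first should turn it into a column.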
If you want to do this in a loop, replace this part:
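# appended_data is assumed to be a list created before the loop
for i in continents:
    # boolean mask instead of query(), then the 5 largest counts
    continen = threats[threats['continent'] == i].nlargest(5, 'binomial_name')
    appended_data.append(continen)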
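On the error itself: inside `query`, a bare `i` is looked up as a column name, not as the Python loop variable. Prefixing it with `@` makes `query` use the local variable. Here is a runnable sketch of the loop under the same assumptions as the question (`continent` is a column, `binomial_name` holds the counts). The data and the name `pieces` are made up for illustration. It collects the pieces in a list and joins them with `pd.concat`, because `DataFrame.append` was removed in pandas 2.0.

```python
import pandas as pd

# Hypothetical data; the threat names and counts are invented for illustration.
threats = pd.DataFrame({
    "continent": ["Africa"] * 6 + ["Asia"] * 6,
    "threat_type": list("ABCDEF") * 2,
    "binomial_name": [143, 102, 3, 36, 30, 7, 5, 80, 12, 1, 44, 9],
})

pieces = []  # a plain Python list that collects one frame per continent
for i in threats["continent"].unique():
    # '@i' refers to the Python variable i; a bare 'i' would be treated
    # as a column name, which is what triggers the error in the question.
    top = threats.query("continent == @i").nlargest(5, "binomial_name")
    pieces.append(top)

# Combine all per-continent frames into one data frame.
top5 = pd.concat(pieces, ignore_index=True)
print(top5)
```

No `groupby` is needed inside the loop, since each iteration already works on a single continent.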